Production ML Projects

End-to-end machine learning solutions deployed at scale, delivering measurable business impact through innovative AI implementations.

BERT NLP Production ML 95.53% Accuracy

BERT Model Implementation for Article Relevance

September 2024

Challenge

The existing Logistic Regression model for article relevance detection capped prioritization accuracy at 82.84% and missed critical semantic nuances in content evaluation.

Solution

Transitioned from Logistic Regression to BERT (Bidirectional Encoder Representations from Transformers) to leverage superior language understanding for semantic analysis.

Implementation

  • Model Architecture: BERT-based transformer model with bidirectional context processing
  • Training Data: 24,000+ unique entries (February-August 2024) vs 5,200 in previous model
  • Deployment: Production deployment with comprehensive performance tracking
  • Testing: Validated on 2,419 samples with SME annotations
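The classification step above can be sketched with the Hugging Face Transformers library. This is a minimal illustration, not the production configuration: the checkpoint name (`bert-base-uncased`), the two-label setup, and the `predict_relevance` helper are all assumptions for demonstration.

```python
# Sketch of a BERT relevance classifier using Hugging Face Transformers.
# Checkpoint, label layout, and truncation settings are illustrative
# assumptions, not the production configuration.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # assumed: relevant / not relevant
)
model.eval()

def predict_relevance(text: str) -> float:
    # BERT reads at most 512 tokens; longer articles would need chunking.
    inputs = tokenizer(text, truncation=True, max_length=512,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Probability of the "relevant" class (index 1 by assumption).
    return torch.softmax(logits, dim=-1)[0, 1].item()
```

In practice the head would be fine-tuned on the 24,000+ annotated entries before the probabilities are meaningful; the untuned checkpoint above only shows the inference path.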

Results

  • 95.53% final accuracy (+2.2% vs Logistic Regression)
  • 1,105 relevant articles detected (vs 1,030 previously)
  • 85.04% relevance classification with higher-confidence predictions

Impact

Enhanced article prioritization with significantly better relevance detection, higher prediction confidence across global regions, and reduced misclassification risk for business-critical content.

AWS Lambda PostgreSQL Real-time ML Docker

Real-time ML Pipeline Architecture

2024

Objective

Build end-to-end ML pipeline for real-time article and feedback classification with automated storage and prediction capabilities.

Architecture

  • Data Ingestion: PostgreSQL triggers for automatic ML prediction invocation
  • ML Processing: AWS Lambda functions with Docker containers for model deployment
  • Model Loading: Pre-trained BERT models with text vectorizer optimization
  • Storage Design: Custom table structures for prediction results and metadata
  • Monitoring: Comprehensive logging and performance tracking via Tableau dashboards

Technical Implementation

Created Python Lambda functions packaged in Docker containers and deployed on AWS, with PostgreSQL database triggers that automatically invoke ML predictions for new content. The pipeline also covers training-data cleansing, model building, and automated deployment.
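The Lambda side of this flow can be sketched as a handler that receives the newly inserted row from the database trigger and returns a prediction. The payload shape, field names, and the `classify` placeholder are illustrative assumptions; in the real system the trigger-to-Lambda hop would go through something like the `aws_lambda` RDS extension, and `classify` would call the loaded BERT model.

```python
# Sketch of a Lambda handler invoked by a PostgreSQL trigger for each new
# article row. Payload shape and the classify() stand-in are assumptions.
import json

def classify(text: str) -> str:
    # Placeholder for the real BERT inference, loaded once at cold start.
    return "relevant" if "incident" in text.lower() else "not_relevant"

def lambda_handler(event, context):
    # The trigger passes the new row as the event payload; direct invokes
    # send it as-is, API Gateway wraps it in a "body" string.
    article = json.loads(event["body"]) if "body" in event else event
    label = classify(article["fulltext"])
    # In production the prediction is also written to a results table.
    return {
        "statusCode": 200,
        "body": json.dumps({"id": article["id"], "label": label}),
    }
```

Loading the model outside the handler (at module scope) is what keeps warm invocations low-latency; only cold starts pay the model-loading cost.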

Performance

Processing Volume: Thousands of daily predictions
Response Time: Low-latency real-time processing
Accuracy: 95.53% prediction accuracy maintained
Scalability: Auto-scaling Lambda architecture

Business Value

Automated ML pipeline eliminated manual classification bottlenecks, ensuring real-time content evaluation with consistent accuracy and comprehensive audit trails for compliance and optimization.

GPT-4o-Mini Cost Optimization NLP 86.6% Savings

GPT-4o-Mini Transition for Article Summarization

2024

Challenge

High operational costs ($103/month) with GPT-3.5 for article summarization, plus limitations in handling large articles due to token constraints.

Solution Strategy

Strategic transition to GPT-4o-Mini to achieve cost efficiency while maintaining or improving summarization quality and expanding token capacity.

Implementation Details

  • Model Migration: Seamless transition from GPT-3.5-turbo to GPT-4o-Mini
  • Token Capacity: Expanded to 128k token limit for larger article processing
  • Quality Enhancement: Improved contextual understanding and summary coherence
  • Cost Monitoring: Real-time cost tracking and optimization metrics

Results Achieved

Before (GPT-3.5): $103.00/month, limited token capacity, basic summarization

After (GPT-4o-Mini): $13.84/month, 128k tokens, enhanced quality

86.6% Cost Reduction | Enhanced Quality | Better Scalability

Quality Improvements

  • Better handling of complex, multi-topic articles
  • Enhanced contextual understanding for technical content
  • Improved summary coherence and relevance
  • Support for multilingual content processing

ML Integration Location Analytics 23% Time Savings PostgreSQL

Unified Article Prioritization System

June 2024

Problem Statement

Neptune's article prioritization relied on separate ML Priority, Location Priority, and Category Priority components, creating inefficiencies and inconsistent prioritization decisions.

System Redesign

Integrated machine learning predictions with refined location values from Named Entity Recognition (NER) using ChatGPT, connected to Unimap Priority Grid (Quadkey) for unified scoring.

Architecture Components

Location Priority

Spatial joins with Targeted Priority Area (TPA) scores and location values from Unimap Priority Grid

Reason Priority

ML-predicted categories with trust levels: 'good', 'out of scope', 'far in future', 'too vague'

Combined Priority

Unified scoring formula: ((TPA + Location)/2)/1000 + (Reason Priority * Weight)/5
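The scoring formula above can be written out directly. The example values (TPA and location on a 0-1000 scale, a reason priority of 4 with weight 1) are assumptions chosen only to show the arithmetic.

```python
# Direct transcription of the unified scoring formula:
# ((TPA + Location)/2)/1000 + (Reason Priority * Weight)/5
def combined_priority(tpa: float, location: float,
                      reason_priority: float, weight: float) -> float:
    # Spatial component: average of TPA and location scores, scaled down.
    spatial = ((tpa + location) / 2) / 1000
    # Category component: ML trust level scaled by its weight.
    reason = (reason_priority * weight) / 5
    return spatial + reason

# Illustrative values: TPA=800, Location=600, reason_priority=4, weight=1
# -> spatial 0.7 + reason 0.8 = 1.5
```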

Performance Validation

Testing Period: June 28, 2024 (4 days post-implementation)
Sample Size: 3,800 articles manually rated by Subject Matter Experts

  • Articles above 50% threshold: 2,473 (Classic) → 1,913 (Combined), 560 fewer articles (23% reduction)
  • Good articles missed: 43 (Classic) → 37 (Combined), a 14% risk reduction
  • Prioritization accuracy: +30% over the Classic baseline

Business Impact

23% Time Savings

SMEs review 560 fewer articles while maintaining quality

14% Risk Reduction

Fewer high-priority articles missed in evaluation

30% Accuracy Improvement

Enhanced prioritization precision through ML integration

Text Extraction Multilingual 44.77% Success Rate Python

Enhanced Fulltext Extraction System

2024

Challenge

Existing fulltext extraction system struggled with multilingual content and complex website structures, leaving thousands of URLs unprocessed.

Technical Enhancement

  • Library Integration: Advanced libraries including trafilatura and jusText for robust extraction
  • Multilingual Support: Enhanced character encoding handling for global content
  • Structure Recognition: Improved handling of diverse website layouts and content formats
  • Quality Filtering: Intelligent content validation and noise reduction

Performance Results

  • 3,109 previously unprocessed URLs targeted
  • 1,392 successful extractions
  • 44.77% success rate on previously failed URLs

Technical Improvements

  • Robust character encoding detection and conversion
  • Advanced HTML parsing with multiple fallback strategies
  • Content quality scoring and validation
  • Scalable processing pipeline for batch operations

Impact

Significantly expanded text extraction capabilities, enabling processing of previously inaccessible multilingual content and complex web structures, directly improving data availability for downstream ML models and analysis workflows.