Fraud Detection System
Automated ML retraining pipeline with zero-downtime deployment, real-time monitoring, and GDPR compliance — built with Scrum + Kanban hybrid methodology.
Project Proposal submitted to Mr. Marku
Reference: IT Projects Course Assignment
Use Case: Retrain Fraud Model on New Patterns
Aim: To reduce the time required to update the system with new data, ensuring high accuracy for legitimate users while minimizing downtime.
Overview: This process involves retraining the classifier to identify fresh fraudulent transaction types that the current production model misses. The automation of this process significantly reduces financial losses by cutting down the delay between spotting new fraud patterns and deploying updated models.
Roles & Responsibilities
- Project Manager — Accountable for approving and deploying models to production
- ML Engineer — Responsible for training the Challenger model and comparing it vs Production
- MLOps Engineer — Manages model registration, deployment, performance monitoring, and rollbacks
- Fraud Analyst (Subject Matter Expert) — Reviews and labels fraud cases, and conducts bias/fairness audits
- Tech Lead — Responsible for triggering the retrain & evaluate pipeline
- Product Owner — Accountable for the final performance report and business value
- Data Engineer — Pulls the latest datasets from the Feature Store
- QA Engineer — Generates performance reports and validates model quality
- Automated Pipeline (System) — Orchestrates training, evaluation, and deployment processes
Pre-conditions
- New labeled data: The Fraud Analyst has already tagged recent suspicious transactions as "confirmed fraud"
- Dataset availability: This updated data is ready in the Feature Store
Scenario
- First, the Fraud Analyst finishes reviewing the fraud cases missed yesterday.
- The MLOps Engineer then triggers the "Retrain & Evaluate" pipeline.
- The System pulls the latest dataset from the Feature Store.
- A "Challenger" model is trained by the System to recognize these new patterns.
- The System compares this Challenger model against the active Production model (checking metrics like Recall and False Positive Rate).
- A report is generated showing that the new model detects the fraud without flagging legitimate users.
- Finally, the System registers the Challenger model as a release candidate.
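The promotion decision in steps 5–7 can be sketched as a simple gate. This is an illustrative sketch, not the project's final implementation: the metric names and the 1% FPR cap are assumptions taken from elsewhere in this proposal.

```python
# Hypothetical sketch of the Challenger-vs-Production gate described above.
# Metric names and thresholds are illustrative, not final project values.

def should_promote(challenger: dict, production: dict,
                   max_fpr: float = 0.01) -> bool:
    """Promote the Challenger only if it matches or improves Recall
    without pushing the False Positive Rate above the cap."""
    better_recall = challenger["recall"] >= production["recall"]
    fpr_ok = challenger["false_positive_rate"] <= max_fpr
    return better_recall and fpr_ok

# Example comparison run
challenger = {"recall": 0.96, "false_positive_rate": 0.008}
production = {"recall": 0.91, "false_positive_rate": 0.006}

if should_promote(challenger, production):
    print("Registering Challenger as release candidate")
```

In the real pipeline this gate would run automatically after evaluation, and only a passing Challenger would be registered in the model registry.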
Post-conditions
A new, more accurate model version is registered and staged, ready for zero-downtime deployment. The system maintains high availability while incorporating improved fraud detection capabilities.
Unit Testing Strategy
The unit tests verifying this scenario will be implemented in the next development phase. The full source code, including tests for model comparison and registration logic, will be hosted on GitHub. The repository link will be shared for code review once implementation is finalized.
Planned test coverage includes:
- Data pipeline validation and Feature Store integration
- Model training and evaluation metrics verification
- Challenger vs Production model comparison logic
- Model registry and versioning functionality
- Deployment rollback mechanisms
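As a preview of the planned comparison-logic tests, the sketch below shows the intended shape. The function under test is a stand-in; the real implementation and its test suite will live in the GitHub repository once finalized.

```python
# Illustrative unit tests for the Challenger-vs-Production comparison logic.
# `is_better` is a hypothetical stand-in for the real comparison function.

def is_better(challenger_recall: float, prod_recall: float,
              challenger_fpr: float, fpr_cap: float = 0.01) -> bool:
    return challenger_recall >= prod_recall and challenger_fpr <= fpr_cap

def test_promotes_on_higher_recall_and_low_fpr():
    assert is_better(0.96, 0.91, 0.008)

def test_rejects_on_high_false_positive_rate():
    # Even with better recall, a 2% FPR must block promotion.
    assert not is_better(0.97, 0.91, 0.02)

if __name__ == "__main__":
    test_promotes_on_higher_recall_and_low_fpr()
    test_rejects_on_high_false_positive_rate()
    print("all tests passed")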
Yours sincerely,
Andrii Vlonha
Retrain Pipeline — Flow Diagram
Visual representation of the automated fraud model retraining and deployment pipeline.
Project Kanban Board
Current sprint status for the Fraud Detection System — Retrain Pipeline & Production Rollout.
To Do: 5 · Doing: 3 · Done: 6
Key Elements of IT Project Planning — Applied to Fraud Detection System
Each element below is tailored to the Fraud Detection System project (retrain pipeline & production rollout).
| Element | Owner | Purpose | Category |
|---|---|---|---|
| Identify & Analyze Stakeholders | Project Manager | Map everyone who influences or is impacted by the project to ensure proper engagement and avoid surprises. | Foundation |
| Define Roles, Responsibilities & RACI | PM + Tech Lead | Eliminate confusion — clearly define who owns what to streamline collaboration in fast-paced IT environments. | Foundation |
| Hold Kickoff Meeting | Project Manager | Align team on vision, scope, and processes to kickstart execution. | Launch |
| Define Scope, Budget & Timeline | PM + PO | Set firm boundaries to manage expectations and prevent overruns. | Core |
| Deliverables & Acceptance Criteria | PO + Tech Lead | Make success tangible by specifying outputs and how to verify them. | Core |
| Create Schedule & Milestones | Project Manager | Break down work into actionable steps with timelines. | Execution |
| Plan Resources & Team Capacity | PM + Tech Lead | Ensure availability of resources to avoid bottlenecks. | Execution |
| Risk Assessment & Mitigation | PM + Security | Identify and mitigate threats early to protect project outcomes. | Control |
| Quality & Success Metrics | Tech Lead + QA | Establish benchmarks to ensure the system meets high standards. | Control |
| Communication Plan | Project Manager | Maintain transparency and quick issue resolution. | Control |
RACI Matrix for Fraud Detection System
| Task | Project Manager | ML Engineer | MLOps Engineer | Fraud Analyst | Tech Lead | Product Owner | Data Engineer | QA Engineer |
|---|---|---|---|---|---|---|---|---|
| Review and label fraud cases | I | C | I | R A | C | C | C | I |
| Trigger retrain & evaluate pipeline | C | R | R A | I | C | I | I | C |
| Pull latest dataset from Feature Store | I | R | C | I | I | I | R A | I |
| Train Challenger model | I | R A | C | C | C | I | I | I |
| Compare Challenger vs Production model | C | R | C | C | A | C | I | R |
| Generate performance report | C | C | C | C | C | A | I | R |
| Register Challenger model | I | R | R A | I | C | I | C | I |
| Approve and deploy to production | A | C | R | I | R | C | I | R |
| Monitor production performance | C | C | R A | R | C | C | I | C |
| Handle deployment rollbacks if needed | I | C | R A | I | R | C | I | R |
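A RACI matrix is only useful if each task has exactly one Accountable role and at least one Responsible role. The sketch below is a hypothetical consistency check for rows like the ones above; role names and codes are illustrative.

```python
# Hypothetical sanity check for a RACI row: exactly one Accountable (A),
# at least one Responsible (R). Cell values like "R A" mean both codes.

def validate_raci(row: dict) -> list:
    """Return a list of issues for one task's role assignments."""
    issues = []
    accountable = [role for role, code in row.items() if "A" in code.split()]
    responsible = [role for role, code in row.items() if "R" in code.split()]
    if len(accountable) != 1:
        issues.append(f"expected exactly one Accountable, got {accountable}")
    if not responsible:
        issues.append("no Responsible role assigned")
    return issues

# Example: the "Train Challenger model" row passes the check
row = {"PM": "I", "ML Engineer": "R A", "MLOps Engineer": "C", "QA": "I"}
assert validate_raci(row) == []
```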
Project Priorities (Iron Triangle)
Primary Driver: Quality
In ML Fraud Detection, false negatives mean lost money, and false positives block real users. Scope & Quality are non-negotiable for a passing grade and business value.
Secondary Constraint: Deadline
The project is bound by the university academic calendar. The defense date is fixed, meaning timeline extensions are impossible.
Scope Actions (45%)
- Train Challenger model with 95%+ Precision/Recall.
- Build automated MLflow Retrain Pipeline.
- Implement Zero-Downtime Blue/Green deploy.
Time Actions (35%)
- Strict 4-Sprint lifecycle (2 weeks each).
- Deliver Core Pipeline MVP by Sprint 2.
- Final freeze 1 week before presentation.
Cost Actions (20%)
- Cap AWS/GCP usage at $500/month.
- Use Spot Instances for model training.
- Utilize open-source tools (Grafana, MLflow).
💰 Project Budget Breakdown
Monthly cost allocation for the Fraud Detection System — optimised for academic constraints with a strict $500/month cap.
Compute: GPU instances for model training and inference serving.
- AWS Spot Instances for training (60% savings)
- Auto-shutdown policies for idle servers
- Budget alerts at 80% and 100% thresholds
Storage: Feature Store, model artifacts, and training datasets.
- S3/MinIO for model artifact storage
- PostgreSQL for Feature Store metadata
- Lifecycle policies for old training data
Platform & Tooling: MLflow, Airflow, Kubernetes cluster, and monitoring stack.
- Open-source tools: MLflow, Grafana, Prometheus
- Kubernetes namespace on shared cluster
- Airflow scheduler (small EC2 instance)
Security & Compliance: GDPR data masking, encryption, and audit logging.
- Data masking/hashing pipelines
- Encryption at rest (KMS)
- Audit log retention (90 days)
Contingency: Buffer for unexpected GPU spikes, additional training runs, or scaling needs.
- 6% reserve of total budget
- Approval required from PM before use
- Rolled over if unused in a sprint
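The 80%/100% alert rule above can be expressed as a small check. This is a minimal sketch: the $500 cap comes from this proposal, while the function name and alert wording are illustrative (a real setup would use the cloud provider's native billing alerts).

```python
# Minimal sketch of the 80% / 100% budget-alert rule from the proposal.
BUDGET_CAP = 500.0  # USD per month, per the proposal's cost constraint

def budget_alerts(spend: float, cap: float = BUDGET_CAP) -> list:
    """Return the alerts that the current month-to-date spend triggers."""
    alerts = []
    if spend >= 0.8 * cap:
        alerts.append("WARNING: 80% of monthly budget consumed")
    if spend >= cap:
        alerts.append("CRITICAL: monthly budget cap reached")
    return alerts

print(budget_alerts(410))  # crosses the 80% threshold only
```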
📅 Project Timeline & Schedule
8-week project schedule aligned with 4 Scrum sprints — each activity mapped to its execution window with clear milestones.
Scheduled activities (week-by-week bars are shown in the visual Gantt chart):
- 🏗️ Project Setup & Stakeholders
- ⚙️ MLflow & Pipeline Setup
- 📊 Data Labeling & Feature Store
- 🤖 Train Challenger Model
- 🔍 A/B Testing & QA Evaluation
- 📈 Grafana Monitoring Setup
- 🚀 Blue-Green Deploy to K8s
- 📋 Documentation & Runbooks
- 🧊 Code Freeze & Presentation
Risk Assessment Template
| What are the hazards? | Who might be harmed and how? | What are you already doing to control the risks? | What further action do you need to take to control the risks? | Who needs to carry out the action? | When is the action needed by? | Status |
|---|---|---|---|---|---|---|
| Customer's insolvency (Funding falls through) | Development Team & Agency: Loss of expected revenue, unpaid working hours, and abrupt project cancellation. | We hold regular monthly syncs with the client to assess their business health and project satisfaction. | Require a 30% upfront advance payment before commencing the next project phase; pause execution if invoices are >15 days late. | Project Manager / Finance | Project start | Done |
| Data Drift degrading model accuracy | Business: Missed fraudulent transactions leading to direct financial loss. Users: Increased false positives. | Data Scientists manually evaluate batch transaction data from the previous week to check for statistical deviations. | Implement automated concept drift detection (e.g., Evidently AI) within the MLflow pipeline to trigger auto-retraining alerts. | MLOps Engineer | Sprint 2 | In Progress |
| Cloud Compute (GPU) Budget Overrun | Company Financials: Exceeding the strict $500/month budget reduces overall project profitability. | Basic AWS/GCP billing alerts are configured to trigger emails at 80% and 100% of the budget threshold. | Transition training workloads exclusively to Spot Instances and enforce strict auto-shutdown policies for idle GPU servers. | DevOps Engineer | Sprint 1 | Done |
| Critical Spike in False Positive Rate (>1%) | Legitimate Customers: Payment rejections, account lockouts, and severe UX degradation leading to churn. | Evaluating the Challenger model using standard train/test split metrics on historical static datasets. | Set up Grafana real-time alerts for live FPR metrics and mandate a Shadow A/B testing phase before full traffic routing. | QA / ML Engineer | Sprint 3 | In Progress |
| Production Downtime during deployment | E-commerce Platforms & Users: Unable to process real-time checkouts during the API outage window. | Manual deployments are scheduled exclusively during low-traffic night hours (3:00 AM) with manual rollback plans. | Architect and test a Kubernetes-based Blue-Green deployment strategy ensuring 100% zero-downtime updates. | Tech Lead / DevOps | Sprint 4 | Future |
| GDPR / PII Privacy Violation | Company: Heavy regulatory fines, legal action, and massive reputational damage. | Raw transaction data access is restricted exclusively to authorized senior Database Administrators. | Implement automated data masking and hashing pipelines in the Feature Store before data reaches the ML training environment. | Data Engineer / SecOps | Sprint 2 | In Progress |
| API Inference Latency >50ms | End-users: Frustratingly slow checkout process leading to cart abandonment and lower conversion rates. | Utilizing a simplified baseline model architecture (e.g., XGBoost) to keep prediction times naturally low. | Optimize the final serialized deep learning model using ONNX Runtime or TensorRT to guarantee sub-50ms execution. | ML Engineer | Sprint 4 | Future |
| Unexpected departure of Key Team Member | Project Timeline & Team: Severe delays in pipeline delivery and loss of critical domain/architectural knowledge. | Conducting daily stand-up meetings to share current context, tasks, and immediate blockages across the team. | Enforce a strict "Bus Factor" policy by requiring detailed Runbooks, Architectural Decision Records (ADRs), and mandatory code reviews. | Project Manager | Sprint 1 | Done |
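The drift-detection mitigation planned for Sprint 2 can be illustrated with a Population Stability Index (PSI) over binned feature distributions. This is a hedged sketch of the underlying idea only; the real pipeline would use a library such as Evidently AI, and the bin proportions and 0.2 threshold below are made-up illustrative values.

```python
# Sketch of a data-drift check via Population Stability Index (PSI).
# Inputs are bin proportions of a feature (e.g., transaction amount)
# for a baseline window vs. the current window.
import math

def psi(expected: list, actual: list) -> float:
    """PSI between two binned distributions (lists of bin proportions)."""
    eps = 1e-6  # avoid log(0) on empty bins
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected, actual))

baseline = [0.25, 0.25, 0.25, 0.25]  # last week's distribution, binned
current = [0.10, 0.20, 0.30, 0.40]   # this week's distribution

score = psi(baseline, current)
if score > 0.2:  # common rule of thumb: PSI > 0.2 suggests drift
    print(f"Drift detected (PSI={score:.3f}) — trigger retrain alert")
```

A production version would run this per feature on a schedule and feed the alert into the MLflow retrain trigger described in the scenario.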
What the IT Project Should Produce — Expected Outcomes
Concrete deliverables and measurable results the Fraud Detection System project will produce upon successful completion.
Automated Retraining Pipeline
A fully automated CI/CD pipeline (Airflow + MLflow) that retrains the fraud classifier on new labeled data, evaluates the Challenger model, and registers it — with zero manual intervention.
⚡ 50% faster model updates
Model Versioning & Registry
Every trained model is versioned and tracked in MLflow with metadata (metrics, parameters, artifacts). Rollback to any previous version is possible within minutes.
🔒 Full audit trail
Real-time Monitoring Dashboard
Grafana dashboard tracking live model performance: Recall, False Positive Rate, prediction latency, and data drift alerts — visible to all stakeholders.
🎯 99.9% uptime SLA
Zero-Downtime Deployment
Blue-green or canary deployment strategy on Kubernetes ensures the production system stays online during model updates. Automatic rollback triggered if FPR degrades.
⏱️ <50ms prediction latency
Governance & Documentation
Complete project documentation: RACI matrix, risk register, communication plan, incident runbooks, GDPR compliance checklist, and unit-tested source code on GitHub.
✅ GDPR compliant
Improved Fraud Detection Accuracy
The new model detects 95%+ of fraud cases including previously missed patterns, while keeping False Positive Rate below 1% — protecting legitimate users from being blocked.
📈 95% accuracy achieved
Agile Hybrid (Scrum + Kanban)
How we combine Scrum's structure for academic milestones with Kanban's flow for ML model training.
🔄 Scrum Lifecycle
Scrum gives the team a structured, time-boxed approach to build and evaluate the fraud model in predictable increments.
Product Backlog
All desired features and tasks for the fraud system, prioritised by the Product Owner based on risk impact and technical dependency.
Sprint Planning
Scrum Master + PO + Dev Team commit to backlog items for the next 2 weeks. The Sprint Goal is defined.
Sprint Backlog (Kanban)
The committed subset of tasks for this Sprint. Each item becomes a card on the Kanban board with an assigned owner and progress tracker.
Sprint Execution & Daily Standup (2 weeks)
The team executes. A 15-minute Daily Standup runs every morning to surface blockers before they stall progress.
Potentially Shippable Increment
At the end of each Sprint, the team delivers a working, tested increment that meets the Definition of Done.
Sprint Ceremonies
Structured check-ins to ensure alignment, demonstrate progress, and continuously improve processes.
Sprint Review
Last day of Sprint · ~2 hrs
The team demos the working increment to stakeholders. The backlog is updated based on feedback.
- Live demo of Challenger model metrics
- Showcase automated Retrain Pipeline
Sprint Retrospective
After Review · ~1 hr
The team reflects on process, tools, and relationships. Actionable improvements are identified for the next sprint.
Kanban Board & Principles
How we manage daily task flow, visualize bottlenecks, and limit Work In Progress (WIP).
Visualise the Workflow
Every task (Data Labeling, Model Training, API Deploy) is tracked on a shared board so anyone can instantly see the state of the system.
Limit Work In Progress (WIP)
Strict limits on active tasks prevent context switching. For example, WIP is capped at 3 in the "In Progress" column. If full, team swarms to finish tasks before starting new ones.
Manage Flow
By monitoring how long cards stay in a column (Lead Time / Cycle Time), we identify bottlenecks (e.g., waiting for QA) and adjust resources accordingly.
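The flow metrics above (Lead Time / Cycle Time) are simple to compute from card timestamps. The sketch below uses made-up card names and dates; a real board tool like Jira reports these metrics natively.

```python
# Illustrative cycle-time calculation for Kanban flow monitoring.
# Card names and dates are invented for the example.
from datetime import date

cards = [
    {"name": "Data Labeling", "started": date(2025, 3, 3), "done": date(2025, 3, 5)},
    {"name": "Model Training", "started": date(2025, 3, 4), "done": date(2025, 3, 10)},
]

# Cycle time = days a card spent between "In Progress" and "Done"
cycle_times = [(c["done"] - c["started"]).days for c in cards]
avg = sum(cycle_times) / len(cycle_times)
print(f"Average cycle time: {avg:.1f} days")  # long outliers flag bottlenecks
```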
Live Board Columns (Jira):
- Backlog: Prioritised User Stories, tech debt, and bugs ready to be pulled into a Sprint.
- To Do: Tasks selected for the current Sprint, not yet started. Clear acceptance criteria defined.
- In Progress: Actively being worked on by an assigned owner. Blocked cards flagged red for Standup.
- Review: Awaiting verification — code review, test passing, or metric check by QA.
- Done: Meets Definition of Done. Either deployed or ready for zero-downtime release.
Delegation Worksheet
A preparation tool for project leaders and managers to set people up for success.
I am assigning: Olexiy (MLOps / Tech Lead)
the responsibility of: Developing and configuring the Real-time Monitoring Dashboard in Grafana for the Fraud Model.
Begin at the end: What outcomes are you looking for? What would success look like? How will you make the implicit explicit?
Success is a fully functional Grafana dashboard tracking real-time model metrics (Recall, False Positive Rate, Data Drift). The team receives timely Slack alerts in case of model degradation without false positives.
Why is this task important? Why X [name of staff person]? Why this? Why now?
Why important: In ML Fraud Detection, false negatives equal financial losses. Quality is our Primary Driver.
Why Olexiy: He has the deepest expertise in building ML monitoring and infrastructure.
Why now: This is a critical prerequisite (blocker) to deploying the Automated Retraining Pipeline.
When does it need to be completed by? What are benchmarks along the way?
Completed by: End of the current sprint (Friday, 15:00).
Benchmarks:
- Tuesday: Data sources and logging connected (Prometheus/MLflow).
- Thursday: Alerting rules and Data Drift simulations tested.
Where else can they go for resources, examples, or advice?
- Documentation: The "Governance & Documentation" section.
- Access: Request cluster permissions from the DevOps Engineer.
- References: Existing Grafana dashboard templates in the corporate knowledge base.
Who else should be involved? The MOCHA for this task is:
- Manager: Me (Project Manager)
- Owner: Olexiy (MLOps Engineer)
- Consulted: Data Scientist (for setting Data Drift thresholds)
- Helper(s): DevOps (deployment support and access)
- Approver: Product Owner (accepting the result)
Are any specific approaches (mindsets, values, etc.) needed for this assignment? Remember to distinguish requirements from preferences or traditions.
Requirements: Configurations saved as Infrastructure as Code (IaC); all releases via Zero-Downtime Deployment.
Mindset: Focus on Reliability (alert reliability and speed are more important than visual aesthetics).
How will you seek their perspective and adapt to input?
I will ask: "What specific metrics can we add to better track this exact type of fraud based on your experience?" I'm open to replacing default approaches if he suggests a more effective solution.
How will you make sure you and your staff member are aligned on key points and next steps?
[✔] Verbal repeat-back (I will ask him to briefly describe how we will simulate Data Drift to test the alerts)
What specific products or activities will you want to review or see in action to monitor progress?
When and how will you debrief how things went? What questions will you ask? What feedback will you seek or offer?
When/How: At the next 1-on-1 after task delivery, or during the Sprint Retrospective.
Debriefing questions:
1. What was the biggest challenge in setting up Data Drift monitoring?
2. Did you receive prompt support from DevOps during configuration?
3. What should we change or automate for similar monitoring in our next ML models?
Given the difficulty and importance of the task and my staff member’s will and skill for this project or assignment, my approach should generally be:
[✔] In the mix (Despite high skills, the critical business value and strict deadline dictate regular checkpoints to avoid quality risks)
🧪 Software Testing Strategy
V-Model + Non-Functional + ML-Specific disciplines applied to the Fraud Detection System — 7 categories radially distributed from a central hub, 22 test types total.
Certifications & Achievements
Professional certifications and completed courses in MLOps, Cloud Computing, and Software Engineering.
- MLOps Specialization — DeepLearning.AI (Expected: 2025)
- AWS Solutions Architect — Amazon Web Services (Expected: 2025)
- Kubernetes Administrator (CKA) — Cloud Native Computing Foundation (Expected: 2025)
- Docker & Containerization — Docker Inc. (Expected: 2025)
- Python for Data Science — DataCamp / Coursera (Expected: 2025)
- Machine Learning Engineering — DeepLearning.AI (Expected: 2025)
- Terraform & Infrastructure as Code — HashiCorp (Expected: 2025)
- CI/CD & DevOps Fundamentals — GitHub / GitLab (Expected: 2025)