DISTRIBUTED SYSTEMS

ANALYTICS with GEN AI,
Product Management & Solution Architect.

A Polyglot Product Manager & Architect Specializing in Cloud Native, GenAI, & Enterprise Systems.

THE TECH RADAR


E-Commerce Scaling Solution

Problem: Handle 18T Load via Autoscaling Kubernetes Clusters.

Solution: Implemented HPA/VPA, optimized container images, and used Istio for traffic management.

GenAI Deploying & Managing LLMs

Problem: R&D Platform

Solution: Built a Kubeflow pipeline orchestrator with custom resource definitions using NVIDIA Triton for model serving.

SYSTEM DESIGN CASE STUDIES

🖥️

TECHNOLOGY

Choosing The Right Tool.

🔧

TECHNICALITY FIRST

Systems That Sustain Success.

📈

SCALABILITY FIRST

Systems That Sustain Success.

📊

OBSERVABILITY

Building Robust Systems.

🧠

AI & INSIGHTS

Refining Wisdom into Insights.

GEN AI & FUTURE TECH LAB

Coming soon...
🧠

Private Projects Portfolio

Confidential projects and applications (Password Protected)

Product Management Projects

Strategic product initiatives and management case studies

🚧 WIP - WORK IN PROGRESS

AI & Enterprise Solutions

Distributed Systems, Big Data & Intelligent Applications

Explore our portfolio of AI-powered enterprise solutions spanning distributed systems, big data processing, intelligent UIs, and cloud-native applications. Each solution leverages cutting-edge AI technologies to solve complex business challenges.

🔍

RAG Application

AI-powered Retrieval-Augmented Generation (RAG) application for documents and data. Retrieve relevant information and generate context-aware responses using natural language queries, rather than relying on exact keywords.

☁️

AWS Cost Intelligence

Calculate and analyze your AWS spend in plain English. Ask questions about total cost, breakdown by service or region, trends, and forecasts. Uses AWS Cost Explorer via an MCP-backed AI agent.

📞

Voice Translation & Leave Management

Automatically process employee leave phone messages. Convert voice to text, extract leave details, and auto-enter with approval workflows.

📄

Legal Document Processing

AI-powered analysis and processing of legal documents. Extract key information, identify clauses, and automate document review workflows.

Beta - Advanced Features Coming Soon
🔍

Semantic Search Platform

AI-Powered Context-Based Search with Metadata Filtering

Upload CSV files and search through your data using natural language queries. Our semantic search understands context and meaning, not just exact keywords, and supports intelligent metadata filtering for precise results.

🔍

Context-Based Search

AI-powered semantic search that understands context and meaning, finding relevant results even when exact keywords don't match

Lightning Fast

Instant results with intelligent ranking, relevance scoring, and configurable result limits using "top N" syntax

🏢

Enterprise Ready

Secure AWS S3 integration, scalable architecture, and built for professional data workflows

🎯

Metadata Filtering

Dynamic filtering using natural language: filters such as "with order confirmation" or "having DESKTOP-ABC123" are extracted directly from your queries

📊

CSV File Processing

Upload CSV files, select columns for vectorization and metadata, with automatic S3 storage and local processing capabilities

🔧

Flexible Query Syntax

Support for complex queries: "ticket description with order confirmation top 5", natural language parsing with intelligent result limiting

AWS Cost Intelligence — MCP + Cost Explorer
☁️

AWS Cost Intelligence

Ask questions about your AWS spend in plain English. The AI agent queries AWS Cost Explorer in real time.

📊

Real-time cost data

Total spend, by service, by region, and trends

🔮

Forecasting

AI-generated spend forecast for the coming days

🏢

By account

Per-account breakdown for AWS Organizations
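
As a rough illustration of the kind of Cost Explorer request such an agent might issue, here is a minimal sketch. It assumes the boto3 SDK and the GetCostAndUsage operation; the helper names build_cost_request and fetch_costs, the monthly granularity, the UnblendedCost metric, and the dates are all hypothetical choices, not the application's actual implementation.

```python
from datetime import date


def build_cost_request(start, end, group_by="SERVICE"):
    """Build keyword arguments for Cost Explorer's GetCostAndUsage call."""
    return {
        # Start is inclusive and End is exclusive, both as YYYY-MM-DD strings.
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        # group_by can be "SERVICE", "REGION", or "LINKED_ACCOUNT".
        "GroupBy": [{"Type": "DIMENSION", "Key": group_by}],
    }


def fetch_costs(request):
    """Send the request; needs boto3 and AWS credentials (not run here)."""
    import boto3  # imported lazily so building a request needs no AWS setup

    client = boto3.client("ce")
    return client.get_cost_and_usage(**request)


# Hypothetical example: January 2024 spend broken down by region.
request = build_cost_request(date(2024, 1, 1), date(2024, 2, 1), "REGION")
```

Passing "LINKED_ACCOUNT" as the grouping key would correspond to the per-account breakdown described above.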

Contact

Get in touch with me

Dipankar Bhattacharya


How to Use Semantic Search

📋 Overview

Semantic Search allows you to upload CSV files, vectorize data, and perform intelligent semantic searches using AI-powered embeddings.

Make sure you Process (index) your file before searching. Because this runs on a free-tier account, no more than 4 files can be kept indexed at the same time.

🚀 Getting Started

  1. Choose a CSV file: Click "Choose File" or select an existing file from S3.
  2. Select columns: After choosing a file, select which columns to vectorize (for search) and which to use as metadata.
    Note: If the file was uploaded before, the previously selected vector and metadata columns are shown. You can change these selections if needed.
  3. Upload the CSV file: Click "Upload" to upload the file to S3. The file is uploaded together with your column selections.
  4. Process: After uploading, click "Process" to vectorize your data and create the search index. This step is required before searching.
  5. Search: Once processed, enter your query in natural language and click "Search". If the file was already processed before, you can skip to searching immediately without re-processing.
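
To make the column selection in step 2 concrete, here is a minimal sketch of how a CSV row might be split into text to vectorize and metadata to filter on. The function name split_columns and the sample activity text are hypothetical; this illustrates the idea only and is not the application's code.

```python
import csv
import io


def split_columns(csv_text, vector_cols, metadata_cols):
    """Return one (text_to_vectorize, metadata) pair per CSV row."""
    pairs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Values from the vector columns are joined into one string to embed.
        text = " ".join(row[col] for col in vector_cols)
        # Metadata columns are kept as-is so they can be filtered on later.
        metadata = {col: row[col] for col in metadata_cols}
        pairs.append((text, metadata))
    return pairs


# Hypothetical one-row sample using field names from the sample data.
sample = (
    "device,username,activity_details\n"
    "DESKTOP-ABC123,jsmith@company.com,multiple login failures\n"
)
pairs = split_columns(sample, ["activity_details"], ["device", "username"])
```

Here activity_details is the searchable text, while device and username stay available for metadata filters.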

💡 Tip: If you've already uploaded and processed a file, you can either:

  • Change column selections, upload again, and process with new settings, OR
  • Just trigger a new search query without re-processing

💡 Query Examples

Simple Semantic Search

login failures
unauthorized access
data exfiltration
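
Semantic queries like these are typically answered by comparing an embedding of the query with embeddings of the indexed text. The sketch below ranks records by cosine similarity; the three-dimensional vectors are made-up stand-ins for real model embeddings, and the function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, indexed, top_n=3):
    """Return the texts of the top_n records most similar to the query."""
    scored = [(cosine(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_n]]


# Made-up vectors; a real index would store embeddings produced by a model.
indexed = [
    ("failed sign-in attempt", [0.9, 0.1, 0.0]),
    ("large file export", [0.0, 0.2, 0.9]),
    ("password rejected", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # stands in for the query "login failures"
results = rank(query_vec, indexed, top_n=2)
```

Because ranking works on vector similarity rather than shared words, a record such as "password rejected" can rank highly for "login failures" even though no keyword matches.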

Single Metadata Filter

Use "contains", "with", or "=" to filter by specific fields:

device contains DESKTOP
username contains taylor
device='LAPTOP-MNO345'
risk_score=8.5
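
One way a single filter of this form could be parsed is sketched below. The regular expression and the function name parse_filter are assumptions for illustration; only the "contains" and "=" forms are handled, and the value is returned as a string.

```python
import re

# Matches `field contains value` or `field=value` (quotes around value optional).
FILTER_RE = re.compile(
    r"^\s*(?P<field>\w+)"
    r"(?:\s+(?P<contains>contains)\s+|\s*(?P<eq>=)\s*)"
    r"(?P<value>.+?)\s*$",
    re.IGNORECASE,
)


def parse_filter(expr):
    """Split a filter expression into (field, operator, value)."""
    match = FILTER_RE.match(expr)
    if match is None:
        raise ValueError(f"not a filter expression: {expr!r}")
    operator = (match.group("contains") or match.group("eq")).lower()
    value = match.group("value").strip("'\"")  # drop surrounding quotes
    return match.group("field"), operator, value


parsed = [
    parse_filter("device contains DESKTOP"),
    parse_filter("device='LAPTOP-MNO345'"),
    parse_filter("risk_score=8.5"),
]
```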

Multiple Metadata Filters

Combine multiple filters with "and", "or", or commas:

device='LAPTOP-MNO345' and username contains ctaylor
device contains DESKTOP and risk_score=8.6
username contains williams and device='DESKTOP-PQR678'
device='DESKTOP-DEF456', username contains jwilson
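
The sketch below shows one possible way to evaluate combined filters against a record. It rests on several assumptions that the guide does not state: a comma is treated like "and", "and" binds tighter than "or", "contains" is case-insensitive, and values are compared as strings. The function names are hypothetical.

```python
import re

CLAUSE_RE = re.compile(
    r"\s*(\w+)(?:\s+contains\s+|\s*(=)\s*)(.+?)\s*$", re.IGNORECASE
)


def matches_one(record, clause):
    """Check one `field contains value` or `field=value` clause."""
    match = CLAUSE_RE.match(clause)
    if match is None:
        raise ValueError(f"not a filter clause: {clause!r}")
    field, is_exact = match.group(1), match.group(2)
    value = match.group(3).strip("'\"")
    actual = str(record.get(field, ""))
    if is_exact:
        return actual == value  # "=" means an exact match
    return value.lower() in actual.lower()  # "contains" means a partial match


def matches(record, query):
    """Evaluate clauses joined by "and", "or", or commas."""
    # Assumption: a comma means "and", and "and" binds tighter than "or".
    or_groups = re.split(r"\s+or\s+", query, flags=re.IGNORECASE)
    return any(
        all(
            matches_one(record, clause)
            for clause in re.split(r"\s+and\s+|\s*,\s*", group, flags=re.IGNORECASE)
        )
        for group in or_groups
    )


record = {"device": "LAPTOP-MNO345", "username": "ctaylor@company.com"}
hit = matches(record, "device='LAPTOP-MNO345' and username contains ctaylor")
miss = matches(record, "device='DESKTOP-DEF456', username contains jwilson")
```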

Semantic Search with Filters

Combine natural language search with metadata filters:

find login failures for username contains ctaylor
unauthorized access device='DESKTOP-ABC123'
data export username contains williams
failed authentication device contains LAPTOP

Working Examples with Test Data

These examples will return actual results from the sample data:

device='LAPTOP-MNO345' and username contains ctaylor
device='DESKTOP-DEF456' and username contains jwilson
device='DESKTOP-PQR678' and username contains williams
find login failures username contains taylor

Limit Results

Use "top n" to limit the number of results:

device contains DESKTOP top 5
unauthorized access top 3
username contains taylor top 2
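
A trailing "top n" could be separated from the rest of the query along these lines. This is a minimal sketch: the function name split_top_n and the default limit of 10 are hypothetical, not documented behavior.

```python
import re

# A trailing "top <number>" at the very end of the query.
TOP_N_RE = re.compile(r"\s+top\s+(\d+)\s*$", re.IGNORECASE)


def split_top_n(query, default_limit=10):
    """Return (query_without_limit, limit)."""
    match = TOP_N_RE.search(query)
    if match is None:
        return query.strip(), default_limit
    return query[: match.start()].strip(), int(match.group(1))


limited = split_top_n("device contains DESKTOP top 5")
unlimited = split_top_n("unauthorized access")
```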

Available Fields in Sample Data

Use these exact field names for filtering:

device: DESKTOP-ABC123, LAPTOP-XYZ789, Mobile-iPhone-12
username: jsmith@company.com, ctaylor@company.com
activity_details: (searchable text content)
risk_score: 5.7, 8.5, 9.8 (numeric values)
timestamp_utc_long: (timestamp values)

🔍 Features

  • CSV File Upload: Drag and drop or click to upload CSV files
  • Column Selection: Choose which columns to vectorize and which to use as metadata
  • Semantic Search: Search using natural language queries
  • Single Metadata Filtering: Filter results by specific fields using "contains", "with", or "=" syntax
  • Multiple Metadata Filtering: Combine multiple filters using "and", "or", or commas for precise results
  • Dynamic Filtering: Automatically searches all records when metadata filters are active
  • Top N Results: Control return limit with "top n" syntax

📝 Tips & Best Practices

  • Use natural language - the search understands context and meaning (e.g., "login failures", "unauthorized access")
  • Be specific with metadata filters for better results - use exact device names or usernames
  • Combine multiple filters with "and" for records that match ALL conditions
  • Use quotes for exact matches: device='DESKTOP-VWX012' (exact) vs device contains DESKTOP (partial)
  • Field names must be exact - use the column names from your CSV file exactly as they appear
  • Test with existing data first - try the "Working Examples" above to verify your setup
  • Use "top n" to limit results when searching large datasets (e.g., "top 5")
  • Select relevant columns for vectorization to improve search accuracy - typically text fields like descriptions or comments
  • If no results are found, check that your filter values (device names, usernames, etc.) actually exist in the data

Contact

dipankar.bhattacharyya.career@gmail.com