The Future of Data & AI – Inside NeurDB
- Philip Ho
- Feb 21
- 4 min read
Updated: Feb 24

Right now, AI is changing everything—from predicting diseases in healthcare to recommending your next online purchase—but there’s one major problem:
AI models and databases don’t communicate efficiently.
Let’s take a real-world example. A doctor wants to predict whether a patient is at risk of diabetes using machine learning. Sounds simple, right?
But in reality:
The patient records are stored in a database.
The AI model lives outside the database.
The data has to be extracted, processed, and manually fed into the AI model.
Then the results need to be re-imported into the system for decision-making.
This entire process is slow, inefficient, and expensive.
What if the AI lived inside the database itself? That’s exactly what NeurDB does.

What Is NeurDB?
A team of researchers from the National University of Singapore, Zhejiang University, and Shanghai Jiao Tong University has created NeurDB, an AI-powered autonomous data system.
Their goal? To merge artificial intelligence and database systems into a single, seamless unit.
Why is this important?
AI applications today require heavy data movement between systems.
Databases need manual tuning and maintenance to work efficiently.
AI-driven analytics run separately from databases, increasing cost and latency.
NeurDB solves this by embedding AI inside the database. That means:
AI-powered queries happen directly inside the database (no need to extract data).
It’s self-driving—it learns and optimises itself without human intervention.
It protects privacy—AI models can train across multiple databases without exposing private data.

Now let’s dive into the technical innovations that make this possible.
NeurDB is built on three core innovations:
In-Database AI-Powered Analytics
Traditional databases store and retrieve data. NeurDB analyses and predicts data within the database itself.
This is achieved using:
Dynamic Model Selection – Instead of pre-selecting models, NeurDB automatically picks the best AI model based on the query.
Dynamic Model Slicing – Instead of training one general model, NeurDB adapts AI models to specific queries.
Federated Learning Integration – AI models can train without moving sensitive data, preserving privacy.
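To make the idea of model selection concrete, here is a minimal sketch in plain Python. It is not NeurDB's algorithm; it only illustrates the general pattern of fitting several candidate models and keeping the one with the lowest error on held-out data. The candidate set, the function names, and the data are all made up for illustration.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Baseline candidate: always predict the average target value."""
    avg = mean(ys)
    return lambda x: avg

def fit_line(xs, ys):
    """One-feature least-squares line."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return lambda x: my + slope * (x - mx)

# Hypothetical "model zoo"; a real system would hold far richer models.
CANDIDATES = {"mean": fit_mean, "line": fit_line}

def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given points."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

def select_model(train, holdout):
    """Fit every candidate on train and return the (name, model) with the lowest holdout error."""
    scored = []
    for name, fit in CANDIDATES.items():
        model = fit(*train)
        scored.append((mse(model, *holdout), name, model))
    best = min(scored, key=lambda item: item[0])
    return best[1], best[2]

train = ([1, 2, 3, 4, 5, 6], [2.1, 3.9, 6.2, 8.1, 9.8, 12.1])
holdout = ([7, 8], [14.0, 16.1])
name, model = select_model(train, holdout)
print(name, round(model(7), 2))
```

On this roughly linear toy data the line model should win over the constant baseline; with a different query or dataset another candidate could come out ahead, which is the point of selecting per query.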
Figure 3: AI-friendly in-database ecosystem – Explains model selection, inference, and training directly inside the database.
Intelligent Self-Driving Data System
Most databases require human intervention to optimise performance. NeurDB is designed to run autonomously.
It achieves this through AI-driven system optimisations:
Self-Tuning Query Optimiser – NeurDB uses reinforcement learning to constantly adjust database settings for speed and efficiency.
Fast-Adaptive Concurrency Control – The system learns how transactions interact and dynamically adjusts how data is locked and accessed.
Real-Time Workload Adaptation – Unlike traditional databases, NeurDB analyses usage patterns and auto-scales resources based on demand.
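As a rough illustration of learning-based tuning, the sketch below uses an epsilon-greedy bandit, one of the simplest forms of reinforcement learning, to pick a value for a single setting. This is an assumption-heavy toy rather than NeurDB's optimiser: the knob (a buffer size in MB), the candidate values, and the simulated latency function are all hypothetical.

```python
import math
import random

def simulated_latency_ms(buffer_mb, rng):
    """Hypothetical workload: latency is lowest near 64 MB, plus random noise."""
    return 20 + 10 * abs(math.log2(buffer_mb) - 6) + rng.gauss(0, 2)

def tune_knob(candidates, rounds=500, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit: mostly use the best-known setting, sometimes explore."""
    rng = random.Random(seed)
    totals = {c: 0.0 for c in candidates}
    counts = {c: 0 for c in candidates}

    def avg(c):
        return totals[c] / counts[c]

    for step in range(rounds):
        if step < len(candidates):
            choice = candidates[step]            # try every setting once
        elif rng.random() < epsilon:
            choice = rng.choice(candidates)      # explore a random setting
        else:
            choice = min(candidates, key=avg)    # exploit the best average so far
        totals[choice] += simulated_latency_ms(choice, rng)
        counts[choice] += 1
    return min(candidates, key=avg)

best = tune_knob([4, 16, 64, 256])
print("chosen buffer size (MB):", best)
```

In a real system the reward would come from measured query latency rather than a formula, and the tuner would keep running so it can react when the workload shifts.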
Figure 4: Autonomous architecture of NeurDB – Shows how NeurDB optimises queries and manages transactions with AI.
Privacy-Preserving and Trusted AI×DB
AI models require massive amounts of data, which often includes sensitive information.
NeurDB ensures privacy and security using:
Federated Learning – AI models train on multiple databases without sharing raw data.
Secure AI Execution – Uses techniques like Zero-Knowledge Proofs (ZKPs) and Secure Multi-Party Computation (MPC) to verify AI model results.
Tamper Recovery & Fault Tolerance – If a database is compromised, NeurDB can detect tampering and restore previous data states automatically.
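The federated learning idea can be sketched in a few lines. The example below is a minimal federated averaging (FedAvg) loop in plain Python, not NeurDB's implementation: two hypothetical sites each train a tiny linear model on their own rows, and only the model weights are shared and averaged. The data, learning rate, and round counts are illustrative assumptions.

```python
def local_update(weights, data, lr=0.05, epochs=20):
    """Run gradient descent on one site's private data for the model y = w*x + b."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b

def federated_average(global_weights, sites, rounds=50):
    """FedAvg: each site trains locally; only weights leave the site and get averaged."""
    for _ in range(rounds):
        updates = [(local_update(global_weights, data), len(data)) for data in sites]
        total = sum(n for _, n in updates)
        # Weight each site's update by how many rows it trained on.
        global_weights = tuple(
            sum(wts[i] * n for wts, n in updates) / total for i in range(2)
        )
    return global_weights

# Two hypothetical hospitals whose records both follow y = 2x + 1.
site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
site_b = [(3.0, 7.0), (4.0, 9.0)]
w, b = federated_average((0.0, 0.0), [site_a, site_b])
print(round(w, 3), round(b, 3))
```

Neither site ever sees the other's rows. Note that shared weights can still leak information in practice, which is why real deployments tend to combine this with further protections.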
With AI no longer a separate entity but an integral part of the database, we are entering an era where intelligent data processing happens with far less delay, manual tuning, and security risk. The potential applications are vast, from real-time fraud detection to hyper-personalised e-commerce experiences and AI-assisted medical diagnostics.
Figure 5: Privacy-preserving & trusted AI – Visualises federated learning and security features.
How Can You Build Something Similar at Home?
You don’t need a research lab to experiment with AI-powered databases. Here’s how you can start exploring similar techniques at home:
Pick the Right Database
Start with an AI-friendly database:
PostgreSQL – Supports AI integration using PL/Python.
DuckDB – An in-process database optimised for analytical queries.
Google BigQuery – Has built-in ML functions for running AI queries.
Embed AI into Your Database
Instead of running AI models separately, integrate them directly:
Use PL/Python in PostgreSQL to run machine learning models inside SQL queries.
Try BigQuery’s ML functions (ML.PREDICT()) to perform AI-powered analytics natively.
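If you don't have PostgreSQL or BigQuery set up yet, you can try the same idea with nothing but Python's standard library. The sketch below uses SQLite as a stand-in and registers a Python function so that a model can be called from inside a SQL query. The table, the column names, and the logistic-regression coefficients are hand-picked placeholders, not a trained or clinically meaningful model.

```python
import math
import sqlite3

# Hypothetical, hand-picked coefficients for illustration only.
INTERCEPT, W_GLUCOSE, W_BMI = -9.0, 0.05, 0.08

def diabetes_risk(glucose, bmi):
    """Return a probability between 0 and 1 from a tiny logistic model."""
    z = INTERCEPT + W_GLUCOSE * glucose + W_BMI * bmi
    return 1.0 / (1.0 + math.exp(-z))

conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the name diabetes_risk.
conn.create_function("diabetes_risk", 2, diabetes_risk)
conn.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, glucose REAL, bmi REAL)")
conn.executemany(
    "INSERT INTO patients (glucose, bmi) VALUES (?, ?)",
    [(85.0, 22.0), (150.0, 31.0), (190.0, 36.0)],
)

# The prediction runs as part of the query; no export step is needed.
rows = conn.execute(
    "SELECT id, diabetes_risk(glucose, bmi) AS risk FROM patients ORDER BY risk DESC"
).fetchall()
for patient_id, risk in rows:
    print(patient_id, round(risk, 3))
```

A PL/Python function in PostgreSQL follows the same pattern: the model code is registered once, then called like any other SQL function.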
Automate Query Optimisation
Databases like PostgreSQL offer modules and extensions that help with query tuning, some of them AI-driven:
auto_explain – Logs the execution plans of slow queries automatically, which helps analyse query performance.
AI-powered indexing tools – Optimise searches dynamically.
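Before automating anything, it helps to see what a query plan tells you. The sketch below uses SQLite's EXPLAIN QUERY PLAN as a lightweight stand-in for PostgreSQL's EXPLAIN output, comparing the plan for the same query before and after adding an index. The table and index names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

def plan(sql):
    """Return the human-readable plan steps SQLite reports for a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT total FROM orders WHERE customer_id = 42"
before = plan(query)                      # expected to be a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)                       # expected to use the new index

print("before:", before)
print("after: ", after)
```

An automated tuner does essentially this in a loop: it reads plans or timings, proposes a change such as a new index, and keeps the change only if the measurements improve.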
Implement Privacy-Preserving AI
If working with sensitive data, experiment with:
Federated Learning using PySyft.
Homomorphic encryption to process encrypted data securely.
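To get a feel for what "processing encrypted data" means, here is a toy version of the Paillier cryptosystem, an additively homomorphic scheme. It is a learning sketch only: the primes are tiny and hard-coded, so it offers no real security, and a vetted library should be used for anything serious.

```python
import math
import secrets

# Toy primes for illustration only; real keys use primes of well over a thousand bits.
P, Q = 293, 433
N = P * Q                                              # public key
N_SQ = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # private: lcm(p-1, q-1)
MU = pow(LAMBDA, -1, N)                                # private: valid because g = n + 1

def encrypt(m):
    """Encrypt an integer 0 <= m < N using the public key N."""
    while True:
        r = secrets.randbelow(N - 1) + 1               # random r in [1, N-1]
        if math.gcd(r, N) == 1:
            break
    return ((1 + m * N) * pow(r, N, N_SQ)) % N_SQ      # (n+1)^m * r^n mod n^2

def decrypt(c):
    """Recover the plaintext using the private values LAMBDA and MU."""
    u = pow(c, LAMBDA, N_SQ)
    return ((u - 1) // N) * MU % N

def add_encrypted(c1, c2):
    """Multiplying two ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N_SQ

c_sum = add_encrypted(encrypt(12), encrypt(30))
print(decrypt(c_sum))
```

The party calling add_encrypted never sees 12 or 30, yet the key holder decrypts their sum. That is the property that lets a server aggregate values it cannot read.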
Experiment with AI Models
Train a simple AI model inside your database:
Load your dataset in PostgreSQL.
Train a basic regression model using PL/Python.
Make predictions without extracting data from the database.
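The three steps above can be tried end to end with Python's built-in SQLite module before moving to PostgreSQL and PL/Python. In this sketch, "training" is a closed-form least-squares fit computed with SQL aggregates, and the predictions come from a join against the stored model. The table names and data points are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1: load a small dataset.
conn.execute("CREATE TABLE readings (x REAL, y REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8), (5, 11.0)],
)

# Step 2: "train" a one-feature linear regression with SQL aggregates.
conn.execute("""
    CREATE TABLE model AS
    SELECT slope, avg_y - slope * avg_x AS intercept
    FROM (
        SELECT AVG(x) AS avg_x,
               AVG(y) AS avg_y,
               (COUNT(*) * SUM(x * y) - SUM(x) * SUM(y)) /
               (COUNT(*) * SUM(x * x) - SUM(x) * SUM(x)) AS slope
        FROM readings
    )
""")

# Step 3: predict with a query; the rows never leave the database.
predictions = conn.execute("""
    SELECT r.x, m.intercept + m.slope * r.x AS predicted
    FROM readings AS r CROSS JOIN model AS m
    ORDER BY r.x
""").fetchall()

slope, intercept = conn.execute("SELECT slope, intercept FROM model").fetchone()
print(round(slope, 3), round(intercept, 3))
```

Storing the fitted parameters in a table means any later query can reuse the model, which is a small-scale version of keeping the model next to the data.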
This approach gives you a hands-on feel for how AI-powered databases work—without needing enterprise-level infrastructure.
References & Attribution
To ensure proper credit and comply with fair use guidelines, cite the original research:
Reference: Ooi, B. C., Cai, S., Chen, G., Shen, Y., Tan, K-L., Wu, Y., Xiao, X., Xing, N., Yue, C., Zeng, L., Zhang, M., & Zhao, Z. (2024). NeurDB: An AI-powered Autonomous Data System. Retrieved from arXiv.