May 29, 2024
The Evolution of AI: From Theory to AI Engineering
Yucheng Low
It wasn’t too long ago that Machine Learning (ML) was mostly about theory and math: designing mathematical models to represent and explain a problem, be it object detection or solving Go.
The insight that “data volume” matters more than “model sophistication” for improving performance has been known for a long time, but it is always surprising when it plays out. And every time it does, it completely revolutionizes a field.
In the last few years, the quality of LLMs has improved dramatically, and new applications continue to surprise. AI adoption has also accelerated rapidly, especially since the ChatGPT breakthrough. AI products are now being built every day, impacting hundreds of millions of consumers and businesses around the globe. And we are just at the beginning of the "AI Epoch".
As presented at ODSC East 2024, this means we need a new approach to AI development. More than ever before, we must think about AI/ML as an engineering discipline, not just a scientific one.
The Shift Towards AI Engineering
At XetHub, we envision AI engineering as a multidisciplinary approach that combines principles from software engineering, computer science, and human-centered design to create AI systems for real-world needs and business outcomes. This means that to set your AI projects up for success, there are four best practices to keep in mind.
#1: Clear measurable objectives
Having clear objectives is key: Why do you want AI in your product? What does it improve? How do you measure that improvement?
And those objectives need to be measurable. What is the business metric you actually care about? Time savings? Engagement? Click-through rate? If you can’t measure it, you can’t improve it!
For example, when choosing a foundation model, start from the requirements that are specific to your product: cost, latency, QPS (Queries Per Second), memory, etc. Bigger or newer is not always better. Instead, work out what level of accuracy your objectives actually require. It is generally better to start with the smallest model and work your way up until your requirements are met.
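The "start small and work your way up" rule can be sketched in a few lines of Python. This is only an illustration: the candidate names, the numbers, and the `Requirements` fields are all made up, and in practice the accuracy and latency figures would come from your own evaluation set and hardware.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str           # hypothetical model name
    params_b: float     # model size, in billions of parameters
    accuracy: float     # measured on your own evaluation set
    latency_ms: float   # measured on your own hardware
    cost_per_1k: float  # dollars per 1,000 requests


@dataclass(frozen=True)
class Requirements:
    min_accuracy: float
    max_latency_ms: float
    max_cost_per_1k: float


def meets(c: Candidate, r: Requirements) -> bool:
    """True if the candidate satisfies every product requirement."""
    return (
        c.accuracy >= r.min_accuracy
        and c.latency_ms <= r.max_latency_ms
        and c.cost_per_1k <= r.max_cost_per_1k
    )


def smallest_sufficient(candidates, req) -> Optional[Candidate]:
    """Walk candidates from smallest to largest; return the first that fits."""
    for c in sorted(candidates, key=lambda c: c.params_b):
        if meets(c, req):
            return c
    return None


# Illustrative numbers only, not benchmarks of any real model.
CANDIDATES = [
    Candidate("small", 1.0, 0.81, 40.0, 0.02),
    Candidate("medium", 7.0, 0.88, 120.0, 0.10),
    Candidate("large", 70.0, 0.93, 900.0, 1.50),
]
REQ = Requirements(min_accuracy=0.85, max_latency_ms=300.0, max_cost_per_1k=0.50)

choice = smallest_sufficient(CANDIDATES, REQ)
print(choice.name if choice else "no candidate meets the requirements")
```

With these made-up numbers the small model misses the accuracy bar and the large one is never reached, because the search stops at the first model that fits.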
#2: AI can be wrong
Be ready for when AI makes mistakes. Observability is key to debugging an AI application. Model visualizations, along with versioning, lineage, and provenance across all of the AI project's dependencies, are crucial for explainability.
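One way to picture provenance is a small record stored next to every prediction, tying the output back to the exact model, data, and code that produced it. The sketch below is an assumption about what such a record could contain, not a description of any particular tool; the field names and the `provenance_record` helper are hypothetical.

```python
import hashlib
import json


def fingerprint(data: bytes) -> str:
    """Content hash: identical bytes always give the same identifier."""
    return hashlib.sha256(data).hexdigest()


def provenance_record(model_bytes, dataset_bytes, code_version, prompt, output):
    """Bundle a prediction with what is needed to trace it later."""
    return {
        "model_sha256": fingerprint(model_bytes),
        "dataset_sha256": fingerprint(dataset_bytes),
        "code_version": code_version,  # for example, a commit hash
        "prompt": prompt,
        "output": output,
    }


# Placeholder bytes stand in for real model weights and training data.
record = provenance_record(
    b"fake model weights",
    b"fake training data",
    "abc1234",
    "What is 2 + 2?",
    "4",
)
print(json.dumps(record, indent=2, sort_keys=True))
```

When a prediction turns out to be wrong, a record like this tells you which model and data to look at first.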
Assuming AI can be wrong, make sure to set correct expectations when designing your application. For example, consider how the output is presented and do not design UX which allows people to believe that the AI is an expert. Instead, be transparent and think about how you can leverage human-in-the-loop feedback to improve the system.
#3: Software engineering best practices still apply
Just like in classical software engineering, there are best practices that should be followed when building AI applications. Typically, a software engineering project involves code reviews, continuous integration, testing, and reproducible artifacts. The unique challenge with AI applications is that you'll need to drag along a massive amount of data as well.
Collaboration
Collaboration is key to the success of any software development project, and this is especially true for AI applications. In AI, you should have clear knowledge transfer and sharing processes in place to ensure that everyone on the team is on the same page. You also need the ability to visualize and understand changes as the project progresses and models evolve (architecture, data sets, accuracy metrics, etc.).
Documenting code, data, and processes will help new team members to get up to speed quickly and to contribute effectively.
Reproducibility
In AI, reproducibility is just as important as it is in classical software engineering. You should always be able to go back in time and re-create any artifact, whether predictions, models, or embeddings.
Make things deterministic to ensure that you can reproduce the same results consistently. This will help you to identify and fix any issues that may arise and to improve the overall performance of your AI application.
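As a small illustration of "make things deterministic", here is a sketch of a train/test split whose randomness is controlled by an explicit seed. The function name and signature are hypothetical. The same idea applies to weight initialization, data shuffling, and sampling, although each framework has its own seeding mechanism.

```python
import random


def train_test_split(items, test_fraction, seed):
    """Shuffle with a local, seeded generator so the split is repeatable."""
    rng = random.Random(seed)  # does not touch the global random state
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


first = train_test_split(range(10), 0.2, seed=42)
second = train_test_split(range(10), 0.2, seed=42)

# Same seed, same inputs: the split is identical on every run.
print(first == second)
```

Recording the seed alongside the code and data versions is what lets you go back and re-create the same artifact later.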
Testability
Testing is an essential part of the software development process, and it is no different for AI applications. In AI, you should know the impacts of changes and how they affect the metrics of interest. This is similar to the concept of continuous integration and continuous deployment (CI/CD) in software engineering, where you continuously test and deploy the code to ensure that it is working as expected.
Monitoring tests and metrics over time will help you to identify any trends or patterns and to make informed decisions about how to improve your application.
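A metric check of this kind can run in CI like any other test. The sketch below assumes a plain accuracy metric and a hypothetical `check_no_regression` helper with an arbitrary tolerance; the labels and predictions are toy data.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def check_no_regression(new_metric, baseline_metric, tolerance=0.01):
    """Pass only if the new metric is within `tolerance` of the baseline or better."""
    return new_metric >= baseline_metric - tolerance


# Toy evaluation set: fixed labels, plus outputs from two model versions.
LABELS = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
BASELINE_PREDS = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]   # 8 of 10 correct
CANDIDATE_PREDS = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]  # 9 of 10 correct

baseline = accuracy(BASELINE_PREDS, LABELS)
candidate = accuracy(CANDIDATE_PREDS, LABELS)
print(baseline, candidate, check_no_regression(candidate, baseline))
```

Storing the metric from every run, rather than only the pass/fail result, is what makes it possible to spot trends over time.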
#4: Be flexible and reduce complexity
Last but not least, think about how to minimize complexity in your AI application development. This is particularly relevant in the fast-changing world of AI, where new tools and technologies are constantly emerging.
For example, current models have a limited context and fine-tuning can be expensive, so techniques like RAG (Retrieval Augmented Generation) have been developed to patch these gaps. Similarly, agents are used to fill in for the fact that current models cannot plan well and that there is not a good way to build model ensembles.
However, these are likely temporary measures that will be replaced by more advanced techniques and technologies as the field continues to evolve. In the same way, the vector databases commonly used in RAG systems are probably unnecessary.
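To make the retrieval step of RAG concrete, here is a minimal sketch that ranks a handful of documents against a query in memory and prepends the best match to the prompt. The bag-of-words scoring is a stand-in for a real embedding model, and all function names are hypothetical. It only illustrates the mechanics and makes no claim about how any production system works.

```python
import math
from collections import Counter


def vectorize(text):
    """Toy 'embedding': word counts. A real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=1):
    """Brute-force search: score every document, keep the top k."""
    q = vectorize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, vectorize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Put the retrieved context in front of the question."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


DOCS = [
    "reproducibility means you can re-create any model or prediction",
    "observability helps debug an application when the model is wrong",
    "start with the smallest model that meets your requirements",
]

print(build_prompt("which model should i start with", DOCS))
```

Keeping retrieval behind a small function like `retrieve` also fits the advice to stay flexible: the scoring method or storage layer can be swapped out later without touching the rest of the application.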
Conclusion
The evolution of AI from a theoretical discipline to an engineering-focused domain has opened up a world of possibilities for businesses and organizations across all industries.
With the right approach, AI can be a total innovation booster for your organization. But remember: AI is still in its infancy. Today's AI methods may not be here tomorrow, but good engineering practices are forever.
Sign up for a free trial today and see how XetHub can help you bring modern development practices to your AI projects, no matter the size or number of files, data, models, and other dependencies.