OpenVINO on AWS – Exploring Deep Learning Model Development and Deployment with OpenVINO

June 8, 2024

OpenVINO in Practice

Open Model Zoo

The Open Model Zoo hosts many pre-trained deep learning models that are optimized for Intel hardware. Its centralized repository lets developers build intelligent applications more quickly. The zoo offers pre-trained models for a range of computer vision tasks, including object detection, image segmentation, and image classification.

Age, Gender and Emotion Detection Demo

The Age, Gender, and Emotion Detection demo is a representative sample of the Open Model Zoo’s capabilities. It uses deep learning to extract not only a person’s age and gender from the input (photo or video) but also subtle emotional cues, which shows the adaptability and sophistication of OpenVINO’s pre-trained models.

Developers can quickly incorporate the demo’s C++ inference code into their existing workflows and applications, or modify it to perform additional tasks alongside inference. This simplified integration lets them add sophisticated facial analysis features to their applications and improve the user experience.

Developing Applications around OpenVINO

Demand is growing for solutions tailored to individual needs in today’s fast-paced application development environment. One way to meet that demand is to make deep learning models easily accessible through an intuitive web interface and to turn inference results into actionable insights. By combining different technologies with customized components, developers can design solutions that not only meet needs but also promote efficiency and innovation.

We’ll explore the development of a web application built around Open Model Zoo’s Age, Gender and Emotion Detection Demo, with the goal of improving user accessibility and making actionable insights possible. Along the way, we’ll look at the role of each component and how the components integrate with each other.

Frontend Development with NextJS

NextJS is a powerful full-stack framework that gives developers a stable platform for building fast, feature-rich user interfaces. Features such as server-side rendering and automatic code splitting speed up development and improve the end-user experience, and they have earned the framework an excellent reputation. With NextJS, developers can create dynamic, responsive websites that adapt to a variety of screen sizes and browser settings. Server-side rendering also enables quick page loads and better SEO results, which improve discoverability and user interaction.

In this project, NextJS serves as the user-facing layer for the model, allowing users to start inference with the click of a button.

API Development with FastAPI

FastAPI is an excellent choice for API development because it gives developers an efficient framework for building Python APIs. Its simple syntax and strong performance in handling HTTP requests and responses make it well suited to connecting frontend and backend components. Its asynchronous programming features let developers handle many requests concurrently and keep the service responsive under heavy traffic. FastAPI also provides built-in data validation and serialization, which speeds up handling incoming requests and generating responses.

In this project, FastAPI handles communication between the frontend and the backend. The defined POST endpoint accepts either an existing RTSP stream URL or a video file. This input is passed on to the OpenVINO model, which performs inference in real time and streams the output. The output stream URL produced by the OpenVINO model is passed on to the frontend as the response.
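The request handling behind such an endpoint can be sketched independently of the web framework. The following is a minimal illustration, not the project’s actual code: the base URL, the accepted file suffixes, the `Job` fields, and the function names are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical values: the article does not name hosts, ports, or file types.
OUTPUT_BASE_URL = "http://localhost:8888"
VIDEO_SUFFIXES = {".mp4", ".avi", ".mkv", ".mov"}


@dataclass
class Job:
    job_id: str      # unique ID, later used to look up inference results
    source: str      # RTSP URL or path of the uploaded video file
    output_url: str  # stream URL returned to the frontend


def classify_source(source: str) -> str:
    """Return 'rtsp' or 'file'; raise ValueError for anything else."""
    parsed = urlparse(source)
    if parsed.scheme in ("rtsp", "rtsps") and parsed.hostname:
        return "rtsp"
    if not parsed.scheme and Path(source).suffix.lower() in VIDEO_SUFFIXES:
        return "file"
    raise ValueError(f"unsupported input: {source!r}")


def create_job(source: str) -> Job:
    """Validate the input and build the response the frontend would receive."""
    classify_source(source)
    job_id = uuid.uuid4().hex
    return Job(job_id, source, f"{OUTPUT_BASE_URL}/{job_id}/index.m3u8")
```

In a FastAPI application, a POST handler could call `create_job` and return the resulting `output_url`, mapping a `ValueError` to a 4xx response.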


Real-time Inference with OpenVINO

When FastAPI receives a request, it launches the OpenVINO model, which starts a sequence of actions to extract insights in real time. The C++ inference code was extended beyond the demo’s standard age, gender, and emotion recognition. An added Database Manager module inserts inference records into a PostgreSQL database for later analysis. The model’s capabilities were further extended with new streaming features built on MediaMTX and an OpenCV-GStreamer pipeline.
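One plausible way for the API process to launch the compiled inference executable is as a child process per request. The sketch below assumes this approach; the executable name and the flag names are hypothetical, since the article does not list them.

```python
import subprocess

# Hypothetical executable path and flag names; the article does not list them.
INFERENCE_BINARY = "./age_gender_emotion_demo"


def build_command(source: str, job_id: str, rtsp_out: str) -> list[str]:
    """Build the argument list for one inference run."""
    return [
        INFERENCE_BINARY,
        "--input", source,
        "--job-id", job_id,
        "--output", rtsp_out,
    ]


def launch(source: str, job_id: str, rtsp_out: str) -> subprocess.Popen:
    """Start inference without blocking the caller."""
    # Popen returns immediately, so the API can respond while inference runs.
    return subprocess.Popen(build_command(source, job_id, rtsp_out))
```

Passing an argument list rather than a shell string means a user-supplied URL is never interpreted by a shell.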

Once the OpenVINO model is running, it extracts insights from each frame of the incoming data. At the same time, the Database Manager logs inference entries into the PostgreSQL database for future reference. The model also streams its output as an RTSP stream, which lets users view the results immediately. MediaMTX converts this RTSP stream into HLS format, which the NextJS frontend can play in the browser using libraries like VideoJS.


Streaming with OpenCV-GStreamer Pipeline

The processed inference stream runs through a dedicated OpenCV-GStreamer pipeline. GStreamer provides a flexible multimedia streaming infrastructure, while OpenCV, a versatile computer vision library, offers powerful image processing and manipulation capabilities. Note that for GStreamer and OpenCV to work together, OpenCV should be built from source with the GStreamer backend enabled.
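To make the idea concrete, here is a sketch of how a GStreamer pipeline description for publishing frames over RTSP might be assembled. The project itself does this in C++. The element choices (`x264enc`, `rtspclientsink`), the bitrate, and the function name are assumptions for illustration, not the pipeline the project actually uses.

```python
def build_rtsp_writer_pipeline(rtsp_url: str, bitrate_kbps: int = 2000) -> str:
    """Return a GStreamer pipeline description that encodes raw frames
    to H.264 and publishes them to an RTSP server."""
    elements = [
        "appsrc",                   # frames pushed in by OpenCV
        "videoconvert",             # convert BGR frames for the encoder
        "video/x-raw,format=I420",  # caps filter: pixel format for x264enc
        f"x264enc tune=zerolatency speed-preset=ultrafast bitrate={bitrate_kbps}",
        f"rtspclientsink location={rtsp_url}",
    ]
    return " ! ".join(elements)


# With a GStreamer-enabled OpenCV build, the string could be used like this:
#   writer = cv2.VideoWriter(pipeline, cv2.CAP_GSTREAMER, 0, 25.0, (1280, 720))
#   writer.write(frame)
```

Whether a given OpenCV build includes the backend can be checked by looking for the GStreamer entry in the output of `cv2.getBuildInformation()`.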

The pipeline publishes an RTSP stream to a dedicated server run by MediaMTX. MediaMTX converts RTSP streams into HTTP Live Streaming (HLS) format for smooth playback on the web. This conversion gives consistent viewing across a wide array of devices and browsers.
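The mapping from a published RTSP path to the HLS playlist that a web player loads can be sketched as a small helper. This assumes MediaMTX’s default configuration as we understand it (HLS served on port 8888, with the playlist at `/<path>/index.m3u8`); the port and URL layout should be checked against the actual server configuration.

```python
from urllib.parse import urlparse


def hls_url_for(rtsp_url: str, hls_port: int = 8888) -> str:
    """Derive the HLS playlist URL for a stream published over RTSP.

    Assumes the HLS endpoint runs on the same host as the RTSP server
    and uses the same stream path.
    """
    parsed = urlparse(rtsp_url)
    path = parsed.path.strip("/")
    if parsed.scheme != "rtsp" or not parsed.hostname or not path:
        raise ValueError(f"not a usable RTSP stream URL: {rtsp_url!r}")
    return f"http://{parsed.hostname}:{hls_port}/{path}/index.m3u8"
```

The returned URL is what the FastAPI response would carry back to the frontend for playback.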


Database Management with PostgreSQL

In addition to real-time streaming and inference, the OpenVINO pipeline maintains a dedicated PostgreSQL database that stores key information about detected persons. PostgreSQL makes it easy to store, retrieve, and manage this information. Storing attributes such as age, gender, and emotion in the database keeps the data consistent and accessible for further analysis and visualization.

Every stream processed by the OpenVINO pipeline is given a unique JobID, which facilitates access to the inference results in the database. This distinct ID functions as a key, enabling developers to locate and examine data pertaining to specific streams with ease.
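The JobID lookup can be illustrated with a small table keyed by job. The sketch below uses Python’s built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a database server. The table name, column names, and helper functions are hypothetical, not the project’s actual schema.

```python
import sqlite3

# Hypothetical schema; sqlite3 stands in for PostgreSQL in this sketch.
SCHEMA = """
CREATE TABLE inference_records (
    id          INTEGER PRIMARY KEY,
    job_id      TEXT NOT NULL,
    detected_at TEXT NOT NULL,
    age         INTEGER,
    gender      TEXT,
    emotion     TEXT
);
CREATE INDEX idx_records_job ON inference_records (job_id);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the example schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record(conn, job_id, detected_at, age, gender, emotion) -> None:
    """Store one detection for the given job."""
    conn.execute(
        "INSERT INTO inference_records (job_id, detected_at, age, gender, emotion) "
        "VALUES (?, ?, ?, ?, ?)",
        (job_id, detected_at, age, gender, emotion),
    )
    conn.commit()


def results_for(conn, job_id):
    """Return all detections for one job, oldest first."""
    cur = conn.execute(
        "SELECT detected_at, age, gender, emotion FROM inference_records "
        "WHERE job_id = ? ORDER BY detected_at",
        (job_id,),
    )
    return cur.fetchall()
```

The index on `job_id` keeps per-stream lookups cheap as the table grows.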


Data Analysis and Visualization

NextJS retrieves the stored data from the PostgreSQL database using Prisma ORM, a database toolkit for Node.js and TypeScript. Prisma provides a schema definition language and a type-safe query builder that simplify database interactions and make complex queries easier to write. With Prisma, NextJS extracts the relevant data from the database for analysis and visualization in a user-friendly web interface.

Next, ChartJS is used on the frontend to turn the raw data into visuals that summarize the OpenVINO model’s results. The charts display the counts of emotions, genders, and ages within specific time intervals. These visualizations reveal the underlying trends and patterns in the data, which supports well-informed decision-making.
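The aggregation behind such a chart amounts to bucketing records by time interval and counting per category. The project does this in the NextJS application. The sketch below shows the same idea in Python, with a hypothetical function name and input shape, and returns data in the `labels`/`datasets` layout that Chart.js expects.

```python
from collections import Counter, defaultdict
from datetime import datetime


def emotion_counts_by_interval(rows, minutes=5):
    """Count emotions per time bucket.

    rows: iterable of (iso_timestamp, emotion) pairs.
    minutes: bucket width; assumed to divide 60 evenly.
    """
    buckets = defaultdict(Counter)
    for ts, emotion in rows:
        t = datetime.fromisoformat(ts)
        # Round the timestamp down to the start of its bucket.
        start = t.replace(minute=t.minute - t.minute % minutes,
                          second=0, microsecond=0)
        buckets[start][emotion] += 1

    starts = sorted(buckets)
    emotions = sorted({e for counts in buckets.values() for e in counts})
    return {
        "labels": [s.strftime("%H:%M") for s in starts],
        "datasets": [
            {"label": e, "data": [buckets[s][e] for s in starts]}
            for e in emotions
        ],
    }
```

The same bucketing applies to gender or age-group counts by swapping the second element of each pair.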


Deployment Strategy

To achieve good performance and scalability, the system is deployed on AWS with its components distributed across several servers. The web application runs on one Linux server, while the PostgreSQL database, the FastAPI API, and the OpenVINO model run on another. Because each server can be optimized for its particular job within the application architecture, this separation of responsibilities makes resource allocation and maintenance more efficient.

To configure the environment, we SSH into the Linux server that houses the models and the API. The relevant OpenVINO model and FastAPI code is pulled from Bitbucket, and the C++ code is compiled into an executable for quick and easy use. Centralized data storage with PostgreSQL is also set up on this server. A similar procedure takes place on the server hosting the web application, where the code is pulled from Bitbucket and the environment is configured. For the web application, “npm run build” is used to create an optimized production build before hosting it. Both the frontend and backend servers use Nginx for load balancing. Because the frontend and backend services can scale and update independently, users continue to have a seamless experience when traffic increases.



In brief, the collaborative use of OpenVINO with NextJS, ChartJS, Prisma ORM, FastAPI, GStreamer, MediaMTX, and AWS demonstrates an integrated strategy for constructing and deploying deep learning models. Because of its powerful optimization capabilities, OpenVINO smoothly interacts with AWS infrastructure, providing developers with a scalable and effective platform for AI-powered apps. NextJS, ChartJS, and Prisma ORM also contribute to the frontend development and data analysis components by providing user-friendly tools for showing and interpreting inference findings. FastAPI facilitates communication between components, whilst GStreamer and MediaMTX provide real-time streaming and data conversion for web consumption.

This combination of components highlights the importance of industry-standard tools and their compatibility in producing comprehensive solutions. With OpenVINO at its core, the possibilities for AI-driven applications are immense, backed up by a rich ecosystem of complementary tools and services. Businesses can harness such applications to gain insights and make data-driven decisions.

