Portfolio Projects

Explore our portfolio of successfully delivered projects and discover the diverse use cases in which we have generated substantial value for our clients.

AI Platform Solutions

Projects

Cost-Efficient LLM Operation

Our solution is a robust, cost-efficient, and scalable technology stack for managing LLM operations, addressing the unique challenges of deploying and maintaining large language models. The infrastructure combines Kubernetes-managed hybrid GPU and CPU clusters, lightweight distributions like K3s, and Ray compute engine integration for distributed processing. NVIDIA CUDA and Hugging Face Transformers streamline GPU resource handling and model management, while vLLM enhances inference performance through optimized memory and batching.

A comprehensive observability framework uses Prometheus, Grafana, Fluent Bit, Loki, and DeepFlow for metrics, logs, and distributed tracing, enabling precise monitoring and anomaly detection. Benchmarking validates the system’s high throughput and adaptive capabilities under dynamic loads. Advanced features in development include proactive anomaly detection and self-stabilizing mechanisms for load balancing. Future expansions focus on multi-LLM deployments, edge computing, and GPU optimization, laying a foundation for scalable AI-driven applications.

Read More »

An AI Platform for Structured Infrastructure Access

Our AI platform simplifies GPU infrastructure management using Kubernetes and Ray Train for distributed training and scalable AI workloads. It supports deployment on custom GPU clusters or cloud solutions, adapting to diverse infrastructure needs. Automated installation ensures quick setup, while Prometheus and Grafana provide real-time insights into resource use, training progress, and system health. Advanced orchestration and scheduling optimize GPU utilization and workload distribution across environments. MLOps integration ensures reproducibility, traceability, and streamlined workflows with automated version control, containerization, and artifact management. This solution addresses GPU infrastructure challenges, delivering scalability, cost-efficiency, and reliability for evolving AI workloads.

Read More »

Manufacturing

Projects

Failure Prediction and Predictive Maintenance for Heavy Rotating Machines

Using IoT sensors mounted on industrial machines, our project captures critical telemetry data such as vibration, rotational speed, and temperature. This data is transmitted to an analytics server for AI-driven analysis that assesses machine health and predicts potential failures. Our AI models forecast machine failures 20 to 42 hours in advance, enabling timely corrective actions that prevent downtime. Deployed on more than 2000 machines, our solution has significantly cut operational costs and achieved a return on investment in just 11 months, enhancing maintenance efficiency and reliability.
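As a rough illustration of how telemetry can be turned into an early warning, the sketch below flags readings that deviate strongly from the preceding window. It is a simple rolling z-score baseline, not the AI models used in the project; the function name, window size, threshold, and sample values are all assumptions made for this example.

```python
from statistics import mean, stdev


def drift_alerts(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate from the preceding
    `window` readings by more than `threshold` standard deviations.

    Hypothetical baseline for illustration only.
    """
    alerts = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        # Skip flat baselines to avoid dividing by zero.
        if sigma > 0 and abs(readings[i] - mu) / sigma > threshold:
            alerts.append(i)
    return alerts


# Made-up vibration velocities (mm/s) with one spike at index 7.
vibration_mm_s = [2.0, 2.1, 1.9, 2.0, 2.1, 2.0, 1.9, 3.5, 2.0, 2.1]
print(drift_alerts(vibration_mm_s))
```

A production system would typically combine several signals (vibration, rotational speed, temperature) and learn failure patterns from labeled history rather than rely on a fixed threshold.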

Read More »

Reduce Defective Parts in Manufacturing Production Lines

This project integrates an AI solution with an automated solder paste printing production line for industrial circuit boards. The AI system accesses machine configuration settings and quality control data to correlate configurations with quality outcomes, and it identifies the settings that minimize defective part production. For our customer, the solution reduced defective parts by at least 20% by optimizing component-specific configurations on the production line.
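The idea of correlating configurations with quality outcomes can be sketched in a few lines. The example below groups inspection records by machine setting, computes a defect rate per setting, and picks the setting with the lowest rate among those with enough samples. The record layout, the (pressure, speed) parameters, and the `min_samples` cutoff are hypothetical; the deployed system is not described at this level of detail.

```python
from collections import defaultdict


def defect_rates(records):
    """Map each setting to (defect_rate, sample_count).

    `records` is an iterable of (setting, is_defective) pairs.
    """
    stats = defaultdict(lambda: [0, 0])  # setting -> [defects, total]
    for setting, is_defective in records:
        stats[setting][0] += int(is_defective)
        stats[setting][1] += 1
    return {s: (d / n, n) for s, (d, n) in stats.items()}


def best_setting(records, min_samples=3):
    """Return the setting with the lowest defect rate, ignoring
    settings observed fewer than `min_samples` times."""
    candidates = {
        s: rate
        for s, (rate, n) in defect_rates(records).items()
        if n >= min_samples
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Made-up records: setting is (pressure, speed), flag is "defective".
records = [
    ((5, 40), False), ((5, 40), False), ((5, 40), True), ((5, 40), False),
    ((6, 40), True), ((6, 40), True), ((6, 40), False),
    ((5, 60), False),
]
print(best_setting(records))
```

The `min_samples` guard keeps a setting that was only tried once from winning by chance; a real analysis would also account for component type and confidence intervals.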

Read More »

Public Sector

Projects

Automating Information Extraction with Named Entity Recognition and LLMs

Our project aimed to automate the process of extracting information from various data sources, including textual documents and database tables, using Named Entity Recognition (NER) alongside Large Language Models (LLMs).

Starting Point: Recognizing the time-consuming nature of manually extracting information from

Read More »

LLM Assistant

LLM assistant for everyday tasks: trustworthy, efficient, and secure LLM deployment and fine-tuning to automate, speed up, and improve day-to-day assignments.

Starting Point: Identify a set of everyday tasks (organizing, reporting, translating, transcribing, responding, etc.) to be fully automated and accelerated.

Objective: Harness the power of (open-source) LLMs to automate the

Read More »

QueryCortex

Trustworthy LLMs with QueryCortex: providing LLM-based search and analysis of confidential information using the access rights from the enterprise identity management system.

Starting Point: Information on the selected / deployed LLM version in the enterprise. Information about the interface of the local IDM as well as access to stored identities,

Objective

Read More »

IT Operation

Projects

Integrating Sustainability in the Software and Cloud Lifecycle

Empowering developers with innovative tools for optimizing energy efficiency and minimizing carbon footprint in cloud computing.

Starting Point: OpenStack-based cloud infrastructure and cloud services.

Objective: Create a toolbox for measuring, analyzing, and optimizing energy use in cloud services and infrastructures, offering

Read More »

Predictive Observability for DIMMs and SSDs

Delivering early warnings to system administrators so they can migrate data and running system components away from soon-to-fail SSDs or computing nodes.

Starting Point: Telemetry data (SMART format) from around 1500 SSD and DIMM devices running in a large-scale cloud data center.

Objective: Predict individual device

Read More »

Root Cause Analysis on Android Log Data

Root Cause Analysis on Android Log Data for Failure Investigation

Root cause analysis of Android log data enables the development team to identify underlying issues, fostering stable service creation. By comprehensively understanding failures through this analysis, the team can effectively mitigate them in subsequent updates, enhancing the reliability and performance

Read More »

Anomaly Detection BitFlow

Predictive IT Fault Tolerance: analyzing metric data and logs to predict upcoming infrastructure failures and migrate containers / VMs to safe resources to guarantee 24/7 availability.

Starting Point: Observability data from OpenStack monitoring and logging components, arriving from a large production cloud infrastructure.

Objective: Analyze the incoming data to detect

Read More »

Contact us

Do you have a use case in mind, or want to explore how AI can add value to your business? Send us a message.