Indisziplinäres Promovierendenkolloquium 2025 - Anmeldung für externe Partner

Name: Indisziplinäres Promovierendenkolloquium 2025 - Anmeldung für externe Partner
Start: 2025-07-17T15:15:00+02:00
End: 2025-07-17T17:45:00+02:00
Location: Hochschule Offenburg

17 July 2025

Time zone: Europe/Berlin

Presentation of a Visual RAG Pipeline applied on Product Advertisements

17.07.2025, 16:45

15m

Zoom (Hochschule Offenburg)

The Zoom meeting details for each panel are listed in the Book of Abstracts.

Short presentation (Panel: Smart Digitalisation)

Bianca Lamm (ISIn)

Vision Language Models (VLMs) represent a major advancement in multi-modal Artificial Intelligence, combining visual and textual data processing. However, VLMs mainly have knowledge of publicly available data. The Retrieval Augmented Generation (RAG) approach addresses this limitation by giving a model access to external information.
In this talk, a Visual RAG Pipeline that merges the RAG approach with VLMs is presented. The pipeline involves five main steps: Preprocessing, Vector Store, Retrieval, Classification and Relational Query, Prompt Generation, and Completion. A custom dataset was used to evaluate the pipeline. It comprises images of product advertisements as presented in leaflets, along with the corresponding product and promotion information. Promotion data includes price, regular price, and discounts, while product data covers attributes such as brand, weight, and the Global Trade Item Number (GTIN), a standardized and unique identifier for products.
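The retrieval and prompt generation steps named above can be illustrated with a minimal sketch. It is not the pipeline presented in the talk: the `Product` record, the three-dimensional toy embeddings, the placeholder GTINs, and the prompt wording are all assumptions made for illustration, and a real system would obtain embeddings from an image or text encoder and pass the prompt to a VLM.

```python
import math
from dataclasses import dataclass


@dataclass
class Product:
    """Hypothetical vector-store entry; field names are illustrative."""
    gtin: str         # placeholder identifier, not a real product
    brand: str
    description: str
    embedding: list   # toy vector standing in for an encoder output


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, store, k=2):
    """Return the k store entries most similar to the query embedding."""
    ranked = sorted(
        store,
        key=lambda p: cosine_similarity(query_embedding, p.embedding),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(candidates):
    """Turn retrieved candidates into a text prompt (wording is an assumption)."""
    lines = ["Which of the following products is shown in the advertisement image?"]
    for i, p in enumerate(candidates, start=1):
        lines.append(f"{i}. {p.brand} {p.description} (GTIN {p.gtin})")
    lines.append("Answer with the GTIN of the best match.")
    return "\n".join(lines)


# Toy store: two highly similar products and one unrelated product.
STORE = [
    Product("00000000000017", "BrandA", "chocolate bar 100 g", [0.9, 0.1, 0.0]),
    Product("00000000000024", "BrandA", "chocolate bar 200 g", [0.8, 0.2, 0.1]),
    Product("00000000000031", "BrandB", "orange juice 1 l", [0.0, 0.1, 0.9]),
]

query = [0.9, 0.1, 0.0]  # stands in for the embedding of an advertisement image
candidates = retrieve(query, STORE, k=2)
prompt = build_prompt(candidates)
print(prompt)
```

Retrieval narrows the choice to a few near-identical candidates, and the final decision among them is left to the model that completes the prompt.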
In the retail and supply chain domain, data related to GTINs are crucial for reporting and analysis. Given the constantly changing range of traded products, many of which are often highly similar, the Fine-Grained Classification (FGC) of these products is essential for effective analysis.
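Because the GTIN is the key that downstream reporting relies on, a predicted GTIN can be sanity-checked before use. The following sketch applies the standard GS1 modulo-10 check digit rule; whether the presented pipeline performs such a check is not stated in the abstract, and the function names are assumptions.

```python
def gtin_check_digit(payload: str) -> int:
    """Compute the GS1 check digit for a GTIN without its last digit.

    Starting from the rightmost payload digit, digits are weighted
    alternately by 3 and 1; the check digit brings the weighted sum
    up to the next multiple of 10.
    """
    total = 0
    for position, char in enumerate(reversed(payload)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_gtin(gtin: str) -> bool:
    """Check length (GTIN-8/12/13/14), digits only, and the check digit."""
    if not (gtin.isascii() and gtin.isdigit()):
        return False
    if len(gtin) not in (8, 12, 13, 14):
        return False
    return gtin_check_digit(gtin[:-1]) == int(gtin[-1])


print(is_valid_gtin("4006381333931"))  # well-formed GTIN-13
print(is_valid_gtin("4006381333932"))  # same number with a wrong check digit
```

A check like this only catches malformed identifiers. It cannot tell whether a well-formed GTIN belongs to the product shown in the image.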
The FGC task has been explored using the Visual RAG Pipeline. A comparison of several VLM back-ends within this pipeline, including GPT-4o, GPT-4o-mini, and Gemini 2.0 Flash, yielded an accuracy of 86.8%.
