
Article co-written with Nathanaël Fijalkow
CNRS Research Scientist, LaBRI
Post-doc at Oxford, Berkeley, and the Alan Turing Institute in London
CNRS Researcher since Jan 2018, affiliated with LaBRI, University of Bordeaux (code generation, machine learning, deep learning, large language models)
Head of the Synthesis team at LaBRI
50+ publications in international conferences and journals
Best Paper Award at AAAI 2025
Tender monitoring has long been a manual, almost artisanal activity, relying on reading official bulletins, printed newspapers, and the first institutional portals that appeared in the 2000s. But this model, still viable a decade ago, has been completely disrupted by the massive increase in the volume of available information. In France, there are now hundreds of different sources to monitor daily. Internationally, this number skyrockets: several thousand, even tens of thousands of unique sources. Each country, each region, each ministry, each utility, each public agency has its own system, its own publication logic, and its own formats.
In this fragmented and ultra-dense environment, companies must identify, filter and analyze extremely heterogeneous information very quickly. The challenge is no longer finding data: it is everywhere. The challenge is now identifying the relevant information, structuring it, and transforming it into decisions. This is where Data Intelligence, combined with artificial intelligence technologies applied to tender analysis, represents a genuine breakthrough. It makes it possible to automate tasks that were previously inconceivable at scale, giving organizations instant access to structured, enriched, and immediately usable global public data.
Long before Data Intelligence emerged, public procurement monitoring relied on simple actions: flipping through a newspaper, manually checking a few websites, filing PDFs into shared folders, and perhaps building an internal spreadsheet or database. This workflow collapsed under the pressure of the explosion in public data volume. Today, an organization seeking to monitor only five major world regions — Europe, Africa, the Middle East, Asia, the Americas — is immediately confronted with an avalanche of institutional sources, regional portals, municipal bulletins, national agency websites, and a multiplicity of publishers that constantly evolve.
Each source has its own formats, rules, level of transparency, and publication rhythm. Notices may be published daily, weekly, irregularly, or even modified after the fact. PDFs may be scanned, poorly structured, or lack metadata. In such a context, manual monitoring has simply become impossible, even for organizations with substantial resources. This is not an organizational problem: it is a problem of scale.
The explosion in public data volume does not come with a proportional increase in quality or standardization. On the contrary, the more information there is, the more noise appears, and the harder it becomes to extract value. This is precisely what makes modern Data Intelligence technologies indispensable.
See our various articles on identifying tenders in France and internationally in the energy and infrastructure sectors:
– How to identify international tenders in the energy sector
– How to identify opportunities in energy infrastructures in France
– How to identify photovoltaic tenders and projects in France
– How to identify international tenders in the energy sector in Africa
– How to monitor public infrastructure tenders in Europe
– Finding consulting tenders in infrastructure and renewable energy in Africa
– Finding engineering tenders in the African energy sector
The first step in the transformation is automating the capture of public data using intelligent crawlers and scrapers. First-generation scrapers were very basic: they visited a website, retrieved its HTML, and extracted content from it using a few predefined rules. The problem was that these systems were extremely fragile: the slightest change in website structure broke everything. Moreover, they were unable to understand the editorial logic of a site, let alone distinguish tender notices from simple news articles.
The new generation of crawlers — designed for large-scale tender monitoring — works entirely differently. They analyze DOM structures, automatically detect patterns characteristic of tender notices, identify relevant sections, adapt their rules when the site changes, reconstruct logic even when pages are modified, and recognize structures the developer has never seen. These are dynamic systems, capable of quickly exploring thousands of sources, bypassing technical obstacles (CAPTCHAs, redirections, multilingual sites), and adapting to the extreme diversity of the global public web.
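To make this concrete, here is a minimal sketch of the kind of pattern-based detection such a crawler might perform on a single page. The URL, keyword list, and HTML containers are illustrative assumptions; in practice these rules are learned or configured per source rather than hard-coded.

```python
# Minimal sketch of pattern-based tender detection on a single page.
# The URL and keyword list are illustrative assumptions, not a real source.
import requests
from bs4 import BeautifulSoup

TENDER_KEYWORDS = ("call for tenders", "invitation to bid",
                   "request for proposals", "submission deadline")

def find_tender_blocks(url: str) -> list[str]:
    """Fetch a page and return text blocks that look like tender notices."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    candidates = []
    # Scan generic containers rather than site-specific selectors,
    # so the rule survives small changes in page structure.
    for block in soup.find_all(["article", "section", "div", "li"]):
        text = block.get_text(" ", strip=True).lower()
        if any(keyword in text for keyword in TENDER_KEYWORDS):
            candidates.append(text[:300])  # keep a short excerpt per hit
    return candidates

# Example with a hypothetical portal URL:
# find_tender_blocks("https://example-procurement-portal.org/notices")
```

A real crawler goes much further (pattern learning, change detection, CAPTCHA handling), but the principle of scanning generic structures rather than brittle site-specific rules is what makes it resilient.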
Thanks to these architectures, companies can for the first time “see” the complete ecosystem of tender publications, not only in France but also in historically complex regions such as West Africa, the Middle East or parts of Asia. Collection has become scalable; it is now industrializable.
Collecting information does not solve the problem. Companies do not only want to see tenders: they want to understand them, qualify them, prioritize them, and immediately extract key information. This is where artificial intelligence applied to tenders plays a central role.
An AI model — even a very advanced one — can only produce good results if it has access to a massive, clean, and representative dataset. In public procurement, this means having historical data containing:
– several hundred thousand tenders,
– distributed by countries, regions and contracting authorities,
– covering years of publication,
– in all existing formats,
– with a wide diversity of vocabulary, expressions and structures.
Why is this essential?
Because every country has its own way of writing a tender, its own legal formulations, its own metadata, and even its own writing habits. A model trained only on French data would be unable to correctly interpret a tender issued in Egypt, Kenya, Colombia or Vietnam.
Historical dataset quality is therefore a non-negotiable requirement for a high-performing model.
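To make the idea of a clean, representative dataset more concrete, here is a minimal sketch of what a single normalized historical record could look like. The field names are illustrative assumptions, not a published schema.

```python
# Illustrative schema for one normalized historical tender record.
# Field names are assumptions made for the sake of the example.
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class TenderRecord:
    source_id: str                  # identifier of the publishing portal
    country: str                    # ISO country code, e.g. "FR", "KE", "VN"
    buyer: str                      # contracting authority
    title: str
    published_on: date
    deadline: Optional[date]        # not always present in the source
    language: str                   # original language of the notice
    raw_format: str                 # "html", "pdf", "scanned_pdf", ...
    sectors: list[str] = field(default_factory=list)

# A representative dataset holds hundreds of thousands of such records,
# spread across countries, years, formats, and vocabularies.
```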
A tender is not a simple text. It is an administrative, legal, and technical object, and AI must extract three distinct layers.
The first layer is general and administrative information: structural elements such as the publication date, deadline, buyer name, execution location, procedure type, and contract type (works, supplies, services).
These elements are essential to classify tenders correctly and to provide a structured base for search engines.
Note: general and administrative information is, in most cases, explicitly provided by contracting authorities and therefore directly accessible via crawling, scraping, or — where available — platform APIs. Because these fields are “hard-coded,” extracting them usually does not require sophisticated algorithms, unlike legal or technical information, which is much more complex to interpret.
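As a hedged illustration, assuming a portal that exposes notices as JSON, mapping these explicitly published fields into a normalized administrative layer can be as simple as the following; every input key is a hypothetical example.

```python
# Illustration: mapping an already-structured notice (e.g. from a portal API)
# into the administrative layer. The input keys are assumed field names.
from datetime import datetime

def parse_admin_fields(notice: dict) -> dict:
    """Normalize explicitly published administrative metadata."""
    return {
        "buyer": notice.get("contracting_authority", "").strip(),
        "published_on": datetime.fromisoformat(notice["publication_date"]).date(),
        "deadline": (datetime.fromisoformat(notice["submission_deadline"]).date()
                     if notice.get("submission_deadline") else None),
        "location": notice.get("place_of_execution"),
        "procedure_type": notice.get("procedure", "").lower(),
        "contract_type": notice.get("contract_nature"),  # works / supplies / services
    }
```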
The second layer is legal and contractual information, which determines whether it is feasible to bid.
They include mandatory site visits, penalties for delay, renewal terms, bonding requirements, certification obligations, and weighted evaluation criteria.
These details are often buried in annexes or difficult-to-detect paragraphs. Automatic extraction requires a model specifically trained to recognize complex legal patterns.
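At the very simplest level, “recognizing legal patterns” can be illustrated with a keyword pre-filter that flags clauses worth routing to a specialized model. The patterns below are illustrative only; a production system relies on models trained on annotated legal text, not fixed keyword lists.

```python
# Naive pre-filter for legally significant clauses, as an illustration only.
# A production system would use a model trained on annotated legal text.
import re

LEGAL_PATTERNS = {
    "site_visit": re.compile(r"\b(mandatory|compulsory)\s+site\s+visit\b", re.I),
    "delay_penalty": re.compile(r"\bpenalt(y|ies)\b.{0,80}\bdelay\b", re.I),
    "bid_bond": re.compile(r"\b(bid|performance)\s+(bond|guarantee)\b", re.I),
    "certification": re.compile(r"\bISO\s?\d{4,5}\b", re.I),
}

def flag_legal_clauses(paragraphs: list[str]) -> dict[str, list[str]]:
    """Return paragraphs that likely contain legally binding constraints."""
    hits: dict[str, list[str]] = {name: [] for name in LEGAL_PATTERNS}
    for paragraph in paragraphs:
        for name, pattern in LEGAL_PATTERNS.items():
            if pattern.search(paragraph):
                hits[name].append(paragraph)
    return hits
```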
The third layer is technical information, and this is where most of the value lies.
These are the most difficult data to extract, as they are usually found in the CCTP (technical specifications), tender rules, or large technical annexes.
Example: photovoltaics
An expert model must be able to extract:
– total project power,
– number of panels,
– usable surface area,
– site type (canopy, rooftop, ground-mounted),
– presence of agrivoltaics,
– distance to the nearest grid connection point,
– land type (brownfield, polluted soil, agricultural land, etc.).
Example: charging stations
A model must extract charging power (AC, DC, ultra-fast), number of units, installation type, required standards, maintenance modalities, and site constraints.
These pieces of information can only be extracted by models specifically trained on technical data. It is not simple text recognition: it is domain interpretation.
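One common way to make such domain interpretation reliable is to ask the model to fill a fixed, typed schema rather than produce free text. Below is a sketch of what a photovoltaics extraction target could look like; the fields mirror the list above, and the names are illustrative assumptions.

```python
# Illustrative extraction target for photovoltaics tenders.
# A specialized model is asked to fill this structure from the CCTP and
# technical annexes; field names are assumptions, not a fixed standard.
from dataclasses import dataclass
from typing import Optional

@dataclass
class PhotovoltaicsSpec:
    total_power_kwp: Optional[float]            # total project power
    panel_count: Optional[int]
    usable_surface_m2: Optional[float]
    site_type: Optional[str]                    # "canopy", "rooftop", "ground-mounted"
    agrivoltaics: Optional[bool]
    grid_connection_distance_m: Optional[float]
    land_type: Optional[str]                    # "brownfield", "polluted soil", ...

# The same pattern applies to charging stations: a schema covering charging
# power (AC / DC / ultra-fast), number of units, installation type, required
# standards, maintenance modalities, and site constraints.
```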
It is also important to note that DCEs — and therefore CCTPs — are not always directly accessible. Buyers often require suppliers to register before downloading tender documents. This creates a major difficulty: some technical information, although essential, cannot be automatically retrieved and therefore cannot be analyzed.
Another underestimated challenge is the cost of processing. When calling generalist LLMs such as Gemini, Claude, Mistral, or ChatGPT to analyze documents, results may be very good, but each analysis requires a paid API call. At global scale — thousands or tens of thousands of documents per day — this becomes prohibitively expensive.
This is why the sector needs native, optimized, lightweight and economically viable models, capable of processing large document volumes without exploding costs.
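A back-of-the-envelope calculation shows why. The volumes and per-token prices below are purely illustrative assumptions, not the actual pricing of any provider.

```python
# Back-of-the-envelope cost estimate for analysing documents with a paid LLM API.
# All numbers are illustrative assumptions, not real provider pricing.
docs_per_day = 10_000            # assumed daily volume at global scale
tokens_per_doc = 20_000          # assumed size of a tender document in tokens
price_per_million_tokens = 3.0   # assumed input price in USD per 1M tokens

daily_cost = docs_per_day * tokens_per_doc / 1_000_000 * price_per_million_tokens
print(f"~${daily_cost:,.0f} per day, ~${daily_cost * 30:,.0f} per month")
# With these assumptions: ~$600 per day, ~$18,000 per month, before output
# tokens, retries, or re-analysis of amended notices.
```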
This is also why this line of work cannot be improvised. It requires years of continuous R&D, often in partnership with technology companies or university laboratories specializing in AI, to create proprietary models adapted to the specific formats, legal constraints and technical diversity of tender documents.

Never before has so much public data been available. But this abundance is not a solution: it is a problem. Noise has become immense, and value is buried beneath oceans of text, scanned PDFs, inconsistent documents, and sometimes redundant publications.
Data Intelligence changes everything. It filters, normalizes, enriches, and interprets data to:
– drastically reduce noise,
– quickly identify relevant opportunities,
– understand technical and legal constraints in seconds,
– make the data searchable and actionable,
– improve commercial decision-making.
This is a strategic transformation, not only a technological one.
Deepbloo is one of the few companies capable of applying Data Intelligence and AI at this level of depth in public procurement, particularly in energy and infrastructure.
Thanks to several years of intensive global crawling, Deepbloo has built one of the largest datasets dedicated to energy/infrastructure tenders, containing hundreds of thousands of historical publications and millions of technical documents from dozens of countries.
This data depth is a unique asset for training truly high-performance models.
Deepbloo has developed its technology in close collaboration with leading academic institutions:
– LaBRI (Bordeaux Computer Science Laboratory), one of the largest computer science labs in France.
– Researcher Nathanaël Fijalkow, specialist in AI, statistical learning, and complex-data modeling.
– The Institute of Data Science of Montpellier (University of Montpellier), with which a partnership was launched in 2025 to develop a new generation of automatic-analysis algorithms for tender documents.
These collaborations make it possible to integrate advanced approaches to text structuring, automatic recognition of complex entities, and technical classification into our models.
Deepbloo’s technological pipelines transform raw data into operational intelligence. As sketched schematically after the list below, they make it possible:
– to automatically capture tenders from thousands of sources,
– to analyze documents with specialized models,
– to extract technical, legal and administrative information,
– to structure data into a standardized format,
– to make it instantly searchable through an expert semantic engine.
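Schematically, and without claiming to describe Deepbloo’s actual implementation, such a pipeline can be pictured as a simple chain of specialized components; every function below is a placeholder stub for one of the steps listed above.

```python
# Schematic end-to-end pipeline, from raw source to searchable record.
# Each function is a placeholder stub for a specialized component.

def crawl_source(url: str) -> list[str]:
    return []          # stub: would yield raw notices found at the source

def normalize_format(raw: str) -> str:
    return raw         # stub: would convert HTML / PDF / scans to clean text

def extract_layers(text: str) -> dict:
    # stub: specialized models fill the administrative, legal and technical layers
    return {"admin": {}, "legal": {}, "technical": {}}

def process_source(url: str, index: list[dict]) -> None:
    for raw_notice in crawl_source(url):
        text = normalize_format(raw_notice)
        record = extract_layers(text)
        index.append(record)   # in practice: a semantic search index, not a list

search_index: list[dict] = []
process_source("https://example-procurement-portal.org", search_index)
```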
For companies, this represents a paradigm shift. They move from slow, incomplete and reactive monitoring to an automated, exhaustive, intelligent system capable of instantly detecting relevant tenders and extracting key information.
Deepbloo was also invited to present its AI approach — based on agile small models specifically trained for tender analysis — at the Dataquitaine event. Here is the video of that presentation.
Request a demo here.
Tender monitoring has become a critical challenge in a world where public data is multiplying and international competition is intensifying. Manual methods can no longer keep up with the scale, diversity, and complexity of publications. Only approaches based on Data Intelligence, specialized AI models, and deep sector expertise now make it possible:
– to cover a global perimeter,
– to extract relevant signals from noise,
– to automatically analyze complex documents,
– to capture essential technical and legal information,
– and to identify opportunities at the right moment.
Thanks to its technical capabilities, data depth and scientific partnerships, Deepbloo is one of the most advanced players in this transformation. The company demonstrates that AI applied to tender monitoring is no longer futuristic: it is an operational reality already improving the commercial performance of many organizations in France and across the world.