Managing industrial data: prevention is better than cure

In healthcare, it is well known that preventing illnesses is more effective than treating them once they have appeared. The same principle applies to industrial data: continuous, proactive maintenance helps to avoid extensive pre-treatment before advanced data analytics techniques are used for decision-making and knowledge generation.

Data pre-treatment involves several tasks, such as (1) data cleaning, (2) correction of errors, (3) elimination of atypical values and (4) standardisation of formats, among others. These activities are necessary to ensure data quality and consistency before the data is used in analysis, decision-making or specific applications.
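
As a rough illustration of these four tasks, the following Python sketch cleans a small batch of readings. The two timestamp formats, the decimal-comma convention and the `max_deviation` threshold are assumptions made for the example; they do not describe any specific CARTIF pipeline.

```python
from datetime import datetime
from statistics import median

# Assumed input formats; a real plant historian may use others.
TIMESTAMP_FORMATS = ("%Y-%m-%d %H:%M", "%d/%m/%Y %H:%M")


def parse_timestamp(text):
    """(4) Standardise a timestamp to ISO 8601, or return None if unreadable."""
    for fmt in TIMESTAMP_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).isoformat()
        except ValueError:
            continue
    return None


def parse_value(text):
    """(2) Correct decimal commas; return None if the value is not a number."""
    try:
        return float(str(text).strip().replace(",", "."))
    except ValueError:
        return None


def pretreat(rows, max_deviation=3.0):
    """Clean (timestamp, value) rows and drop atypical values."""
    # (1) Cleaning: discard rows whose timestamp or value cannot be parsed.
    cleaned = []
    for raw_ts, raw_value in rows:
        ts, value = parse_timestamp(raw_ts), parse_value(raw_value)
        if ts is not None and value is not None:
            cleaned.append((ts, value))
    if not cleaned:
        return []
    # (3) Atypical values: compare each deviation from the median with the
    # median absolute deviation (MAD), which is robust to the outliers themselves.
    values = [v for _, v in cleaned]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return cleaned
    return [(ts, v) for ts, v in cleaned if abs(v - med) / mad <= max_deviation]
```

Calling `pretreat` on rows with mixed date formats, a decimal comma, an unreadable value and one extreme reading returns only the consistent rows, with timestamps in a single format.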

Source: Storyset on Freepik

However, if robust data maintenance is implemented from the outset, many of these errors and irregularities can be prevented. By establishing proper data entry processes, applying validations and quality checks, and keeping records up to date, it is possible to reduce the amount of pre-treatment needed later, identifying and addressing potential problems before they become major obstacles. This includes early detection of errors such as inaccurate data, correction of inconsistencies and updating of outdated information. Companies currently store large amounts of data, but it is important to highlight that not all of this data is necessarily valid or useful, for example, for an artificial intelligence project. Indeed, many organisations face the challenge of maintaining and managing data that lacks relevance or quality. Data maintenance aims to ensure the integrity, quality and availability of data over time.

Efficient data maintenance is crucial to ensure that data is reliable, accurate, consistent, complete and up to date, but it requires continuous monitoring and management by company staff. The most common activities related to data maintenance include:

  1. Regular monitoring: Data is tracked periodically to detect possible problems, such as errors, inconsistencies, losses or atypical values. This can involve reviewing reports, analysing trends or implementing automated alerts to detect anomalies.
  2. Updating and correction: If errors or inconsistencies in the data are identified, maintenance staff ensure that they are corrected and updated appropriately. This may involve reviewing records, checking external sources or communicating with those responsible for data collection.
  3. Backup and recovery: Procedures and systems are established to back up data and ensure its recovery in the event of failure or loss. This may include implementing regular backup policies and conducting periodic data recovery tests.
  4. Access management and security: Data maintenance staff ensure that data is protected and only accessible by authorised users. This may involve implementing security measures such as access control, data encryption or monitoring audit trails.
  5. Documentation and metadata update: Data-related documentation, including field descriptions, database structure and associated metadata, is kept up to date. This facilitates the understanding and proper use of the data by users.
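
The automated alerts mentioned under regular monitoring can be sketched in a few lines of Python. This is a minimal example under stated assumptions: readings arrive as time-ordered `(timestamp_in_seconds, value)` pairs, and the expected sampling step and valid range are known in advance. The function name and alert labels are hypothetical.

```python
def monitor(readings, expected_step, low, high):
    """Return (kind, timestamp) alerts for a time-ordered series of readings.

    readings: list of (timestamp_in_seconds, value), where value may be None.
    """
    alerts = []
    previous_ts = None
    for ts, value in readings:
        # Loss of data: the gap since the previous sample is too long.
        if previous_ts is not None and ts - previous_ts > expected_step:
            alerts.append(("gap", ts))
        if value is None:
            # The sample arrived but carries no measurement.
            alerts.append(("missing", ts))
        elif not low <= value <= high:
            # Atypical value outside the physically plausible range.
            alerts.append(("out_of_range", ts))
        previous_ts = ts
    return alerts
```

In practice the returned alerts would feed a dashboard or notification system, so that maintenance staff can correct the source of the problem instead of repairing the data afterwards.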

In summary, data maintenance involves (1) regular monitoring, (2) correcting errors, (3) backing up and (4) securing the data to ensure that it is in good condition and reliable. These actions are fundamental to maintaining the quality and security of stored information.

At CARTIF, we face this type of problem in different projects related to the optimisation of manufacturing processes for different companies and industries. We are aware of how many staff hours are consumed by the problems described above, so we are working on automatic mechanisms that make life easier for those responsible for the aforementioned “data maintenance”. One example is the s-X-AIPI project, focused on the development of AI solutions with auto capabilities that require special attention to data quality, starting with data ingestion.


Mireya de Diego. Researcher at the Industrial and Digital Systems Division

Aníbal Reñones. Head of Unit Industry 4.0 at the Industrial and Digital Systems Division

Terahertz technologies in industry

In this post, I would like to talk about devices capable of acquiring images in the Terahertz spectral range, an emerging technology with great potential for implementation in industry, especially in the agri-food sector.

Currently, machine vision systems used in industry work with different ranges of the electromagnetic spectrum, such as visible light, infrared and ultraviolet, among others, which are not able to pass through matter. Therefore, these technologies can only examine the surface characteristics of a product or packaging, but cannot provide information from the inside.

In contrast, there are other technologies that do allow us to examine certain properties inside matter, such as metal detectors, magnetic resonance imaging, ultrasound and X-rays. Metal detectors are only capable of detecting the presence of metals. Magnetic resonance equipment is expensive and large, mainly used in medicine, and its integration at industrial level is practically unfeasible. Ultrasound equipment requires contact, requires some skill in its application and is difficult to interpret, so it is not feasible in the industrial sector. Finally, X-rays are a very dangerous ionising radiation, which implies a great effort in protective coatings and an exhaustive control of the radiation dose. Although they can pass through matter, X-rays can only provide information about the different parts of a product that absorb radiation in this range of the electromagnetic spectrum.

Technologies to examine properties inside matter

From this point of view, we are faced with a very important challenge: to investigate the potential of new technologies with the capacity to inspect, safely and without contact, the inside of products and packaging, obtaining relevant information on internal characteristics such as quality, condition, presence or absence of elements inside, homogeneity, etc.

Looking at the options, the solution may lie in promoting the integration in industry of new technologies that work in non-ionising spectral ranges with the ability to penetrate matter, such as the terahertz/near-microwave spectral range.

First radiological image in history: the hand of Röntgen's wife

In 1895, Professor Röntgen took the first radiological image in history, of his wife's hand. 127 years have passed and research is still going on. In 1995, the first image in the Terahertz range was captured, so only 27 years have passed since then. This shows the degree of maturity of Terahertz technology, which is still in its early stages of research. This radiation is not new; we know it is there, but today it is still very difficult to generate and detect. The main research work has focused on improving the way this radiation is emitted and captured coherently, using equipment developed in the laboratory.

In recent years things have changed: new optical sensors and new terahertz sources with a very high capacity for industrialisation have been developed, which opens the doors of industry to this technology. An important research task remains: determining the scope of this technology in the different areas of industry.

CARTIF is committed to this technology and is currently working on the development of the industrial research project AGROVIS, “Intelligent VISual Computing for products/processes in the AGRI-food sector“, a project funded by the Junta de Castilla y León, framed in the field of computer vision (digital enabler of industry 4.0) associated with the agri-food sector, where one of the main objectives is to explore the different possibilities for automatically inspecting the interior of agri-food products safely.

Elicitation of knowledge: learning of expert operators

Elicitation (from the Latin elicitus, “induced”, and elicere, “to catch”) is a term associated with psychology that refers to the fluid transfer of information from one human being to another by means of language.

Knowledge elicitation applied to industry is a process by which valuable information and knowledge are collected and recorded from experts or people with experience in a particular area of the organisation. It is a technique used to identify, extract and document the tacit (implicit) knowledge that resides in the minds of individuals or in organisational processes. It is a way to collect and record existing knowledge that is not available in formal documentation, and it is used in fields such as knowledge management, engineering and business, among others. In engineering, knowledge elicitation can be used to optimise industrial processes, create expert systems, build AI-based applications, etc.

For example, if it were technologically possible to access the minds of workers, as in the fictional series Severance, where a sinister biotech corporation, Lumon Industries, uses a medical procedure to separate work and non-work memories, this knowledge could be recorded and made available for use. It is clear, however, that this premise would raise significant ethical and legal concerns at this point in history; whether that will change in the near future, we do not know.

Knowledge elicitation is important for several reasons. Firstly, it allows organisations to document the existing knowledge of their employees and experts in a specific area. This can help to avoid reinventing the wheel and improve efficiency in decision-making. Secondly, knowledge elicitation can help to identify gaps in an organisation’s knowledge, enabling it to take action in advance. Thirdly, the elicitation process can help foster collaboration and knowledge sharing among an organisation’s employees.

The aim of elicitation is to obtain accurate and relevant information to aid decision-making, improve efficiency and support training and development. This information is used to develop optimal rules for expert performance that serve as the main input for the controls that can be programmed into a production process.

The methodology for knowledge elicitation requires a series of steps to be followed:

  1. Requirements analysis: identifying the approach to knowledge-based systems.
  2. Conceptual modelling: creating a base of terminology used, defining interrelationships and constraints.
  3. Construction of a knowledge base: rules, facts, cases or constraints.
  4. Operation and validation: Operating using automated reasoning mechanisms.
  5. Return to requirements analysis if necessary or continue with the process.
  6. Enhancement and maintenance: Expanding knowledge as the system evolves, repeat throughout the life of the system.

Subsequently, it is necessary to analyse the knowledge collected to determine which information is relevant and which is not, separating the parts of a whole until its principles or elements are known; the result is high-quality knowledge. The requirements previously analysed are then verified to detect defects, normally by means of techniques such as formal reviews, checklists, etc.

The following elements must be taken into account for the elicitation process to be carried out correctly:


The different experts on the process can have different points of view on the same subject, due to their experience, knowledge and even more subjective aspects such as mentality, the way they approach difficulties and challenges, etc. Experts specialised in different stages, infrastructures, equipment, products, etc. should be considered.

The barriers that can appear in this type of information exchange are that the knowledge often involves complex ideas and associations that are hard to communicate simply, in detail and in an organised way, and that a common language, such as shared concepts or specific vocabulary, is needed.

Knowledge elicitation aims to search, research and help users or experts, in this case those involved in the production process, to document their own needs through on-site or online interviews, group meetings, in situ studies, etc.


The best technique for acquiring expert knowledge is to carry out a number of personal interviews. Some of their disadvantages are the distance, time and number of people involved in the process. Paper or online questionnaires can be a viable option that saves time and costs and makes it easier for all sections to be represented, enabling the results to be compared and evaluated.

The characteristics of a good questionnaire design are: defining the relevant information; a good structure with sections organised by theme; ordering the points in each section from general to more detailed, focusing on the idea of that section; avoiding the introduction of bias, misunderstandings or mistakes; and carrying out the design with a domain expert to ensure that the points are understandable enough to make answering easier.


The expected results are the actions to be taken by the operators when parameter deviations occur. The answers and information collected are transformed into the optimal rules needed to program automatic controls for the process, in which these rules are the main element. Obtaining the rules is not an easy task; an iterative and heuristic process in several phases is recommended. For validation, the information collected in the databases must be compared with the operators' answers to verify the actions taken when parameters deviate from the desired values.

These optimal rules, also called if-then rules, are part of the knowledge base, in particular of the relations base, which is the part of an expert system that contains the knowledge about the domain. First, the expert's knowledge is obtained and codified in the relations base.
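
To make the idea of a relations base more concrete, here is a minimal Python sketch in which each if-then rule is stored as a condition and an action. The variable names (`moisture`, `recycle_ratio`), the thresholds and the actions are hypothetical and only illustrate the structure; real rules would come from the elicitation process itself.

```python
# Each rule pairs a condition on the current measurements with an action.
# All names and thresholds below are invented for illustration.
RULES = [
    (lambda m: m["moisture"] > 4.0, "increase dryer temperature"),
    (lambda m: m["moisture"] < 2.0, "decrease dryer temperature"),
    (lambda m: m["recycle_ratio"] > 3.0, "reduce feed rate"),
]


def recommend(measurement):
    """Return the actions of every rule whose condition holds."""
    return [action for condition, action in RULES if condition(measurement)]
```

Keeping the rules as data, separate from the code that evaluates them, makes it easier to review them with the operators and to extend the knowledge base as the system evolves.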

Finally, fuzzy logic can be used for the design and implementation of the expert system. Fuzzy logic uses expressions that are neither totally true nor totally false, which makes it possible to deal with imprecise information such as average height or low temperature in terms of so-called “fuzzy” sets that are combined in rules to define actions, e.g. “if the temperature is high then cool down a lot”. This type of logic is necessary to better approximate the way of thinking of an expert, whose reasoning is not based on the true and false values typical of classical logic, but requires extensive handling of the ambiguities and uncertainties typical of human psychology.
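
A rule such as “if the temperature is high then cool down a lot” can be sketched in Python as follows. This is a simplified, zero-order Sugeno-style scheme: each rule outputs a constant cooling percentage and the result is the average of those constants weighted by how true each rule is. The temperature breakpoints (20, 40 and 60 ºC) and the cooling percentages are assumptions chosen for the example.

```python
def ramp_up(x, a, b):
    """Membership that rises linearly from 0 at a to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def triangle(x, a, b, c):
    """Membership that peaks at b and falls to 0 at a and c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def cooling_power(temp_c):
    """Return a cooling percentage (0 to 100) for a temperature in ºC."""
    # Degree to which the temperature belongs to each fuzzy set.
    low = 1.0 - ramp_up(temp_c, 20, 40)
    medium = triangle(temp_c, 20, 40, 60)
    high = ramp_up(temp_c, 40, 60)
    # Rules: low -> no cooling, medium -> cool a little, high -> cool a lot.
    rules = [(low, 0.0), (medium, 50.0), (high, 100.0)]
    total = sum(weight for weight, _ in rules)  # the three sets always sum to 1
    return sum(weight * output for weight, output in rules) / total
```

At 50 ºC the temperature is half “medium” and half “high”, so the function returns an intermediate cooling level instead of jumping abruptly between two actions, which is closer to how an operator would reason.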

Currently at CARTIF, knowledge elicited from expert plant operators is being used in the INTELIFER project, whose main objective is to optimise, with the support of artificial intelligence, the process and products of a manufacturing line for granulated NPK fertilisers.

The operation of this type of granulated fertiliser plant is controlled manually and heuristically by expert operators who, despite their skills and abilities, cannot avoid high recycle rates, frequent instabilities and undesired stops, as well as limited product quality. This is due to the extremely complex nature of the granulation process, which is multistage, multiproduct, multivariable, non-linear, coupled and stochastic. This situation is the scientific basis for the definition of the project, which requires R&D activities in which the application of artificial intelligence, together with a higher degree of sensorisation and digitalisation, makes it possible to optimise this type of manufacturing process.

Artificial Intelligence, an intelligence that needs non-artificial data

The common denominator of artificial intelligence is the need for available, good-quality, real data to advance in the different procedures needed to create and train models. Practical research in AI often lacks available and reliable datasets with which practitioners can try different AI algorithms to solve different problems.

Some industrial research fields, such as predictive maintenance, are particularly challenging in this respect, as many researchers do not have access to full-size industrial equipment, or there are no available datasets with rich information on the different ways faults can evolve in an asset or machine. In addition, the available datasets are clearly unbalanced, as the norm is for machines to operate properly and only a few examples of faults appear during their lifetime.

From the AI research point of view, it is very important to have reliable and interesting data sources that provide a variety of examples to test different signal processing algorithms and to introduce students and researchers to practical applications such as signal processing, classification or prediction.
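
One common way to compensate for such unbalanced datasets during training is to weight each class inversely to its frequency. The Python sketch below uses the heuristic weight = n_samples / (n_classes × class_count); the labels `healthy` and `fault` and the 95/5 split are invented for the example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency in the label list."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }
```

With 95 healthy samples and 5 fault samples, each fault example weighs 10.0 while each healthy example weighs about 0.53, so the rare faults are not ignored by the model. Weighting does not add information, however; it is no substitute for collecting and sharing more real fault data.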

The ideal situation for researchers and developers of artificial intelligence solutions is that everyone, to a certain extent, shares data. But sharing data should not be seen only as a way to help other people; sharing research data can also bring many advantages to the data donor:

  • It is part of good data practice and open science, which encourage making data accessible together with the scientific papers generated.
  • It cuts down on academic fraud and prevents the publication of studies based on fake data.
  • It validates results. Anyone can make a mistake; if we share the data we used, other researchers can replicate our work and detect any potential errors.
  • More scientific breakthroughs. This is especially true in the social and health sciences, where data sharing would enable, for example, more studies of the human brain, such as those on Alzheimer's disease, and many others.
  • A citation advantage. Studies that make data available in a public repository are more likely to receive more citations than similar studies for which the data is not made available.
  • Better teaching tools based on real cases.

At European level, the European Commission has launched Open Research Europe, a scientific publishing service for Horizon 2020 and Horizon Europe beneficiaries to publish their results in full compliance with the Commission's open access policies. The service provides an easy, high-quality, peer-reviewed venue to publish their results in open access, at no cost to them. Another interesting service that is part of this open research initiative is Zenodo, an open repository to upload your research results. In addition to the open research publishing guidelines, data guidelines are also available, which adhere to the FAIR principles and refer to a number of trusted repositories, such as Zenodo, that we are obliged to use under the European project rules.

The FAIR guiding principles for publishing data mean that the data and the metadata that describe it must be:

  • Findable: (meta)data are assigned a globally unique and persistent identifier.
  • Accessible: (meta)data are retrievable by their identifier using a standardized communications protocol.
  • Interoperable: (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
  • Reusable: (meta)data are richly described with a plurality of accurate and relevant attributes.
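
Before uploading a dataset, a simple script can flag obvious metadata gaps against these four principles. The Python sketch below is only a rough proxy and not an official FAIR assessment: the field names and their mapping to each principle are assumptions made for illustration.

```python
# Hypothetical mapping from each FAIR principle to metadata fields
# that a dataset record would be expected to carry.
REQUIRED_FIELDS = {
    "findable": ["identifier", "title"],
    "accessible": ["access_url"],
    "interoperable": ["format"],
    "reusable": ["license", "description"],
}


def missing_fair_fields(record):
    """Map each principle to the expected fields that are absent or empty."""
    gaps = {}
    for principle, fields in REQUIRED_FIELDS.items():
        missing = [field for field in fields if not record.get(field)]
        if missing:
            gaps[principle] = missing
    return gaps
```

An empty result means that every expected field is filled in; it does not guarantee that the descriptions are rich or accurate, which still requires human review.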

Besides, from the governmental point of view, the European Commission's European Data Strategy and Data Governance policy are powerful initiatives focused on the implementation of European data spaces, among which the Commission proposes the creation of a specific European industrial (manufacturing) data space to take advantage of the strong European industrial base and improve its competitiveness.

As researchers at CARTIF, we are committed to promoting such openness in our research projects. For example, the CAPRI project recently created its own Zenodo repository channel, where we periodically upload results of the advanced solutions we are developing for the process industry, such as cognitive sensors or cognitive control algorithms. You can go to the repository and inspect the more than 40 items (datasets, source code and videos) that we have already uploaded.

Robust solutions with simple ideas

Machine vision is one of the enablers of Industry 4.0 with increased integration in production lines, especially in the quality control of products and processes. In recent years, a real revolution is taking place in this field with the integration of Artificial Intelligence in image processing, with a potential yet to be discovered. Despite the limitations of Artificial Intelligence in terms of reliability, results are being obtained in industry that were previously unthinkable using traditional machine vision.

The purpose of this post is not to talk about the possibilities of Artificial Intelligence, as there are many blogs that already do that; the purpose is to highlight the potential of traditional machine vision when you have experience and develop good ideas.

Machine vision is not just a set of algorithms that are applied directly to images obtained by high-performance cameras. When we develop a machine vision system, we do so to detect a variety of defects or product characteristics. Our task is to select the most appropriate technology and generate the optimal conditions in the scene in order to extract the required information from the physical world from the captured images. There are many variables to consider in this task: the characteristics of the lighting used in the scene; the relative position between the acquisition equipment, the lighting system and the object to be analysed; the characteristics of the inspection area; the configuration and sensitivity of the acquisition systems, etc.

This knowledge can only be acquired from experience and we can highlight that CARTIF has been providing this type of solutions to the industry for more than 25 years.

As a representative anecdote of the importance of experience, I would like to highlight a case that was brought to us by an automotive components factory.

The company had installed a high-performance commercial vision system whose objective was to identify various parts based on colour. After several failures, we were asked to help configure the equipment, but instead of acting on these devices, we worked on changing the lighting conditions of the scene and simply turned the spotlights around and placed panels to obtain diffuse lighting instead of direct lighting. This solved the problem and the vision system reached the level of reliability required by the client.

In this post, I would like to highlight an important success story in the automotive industry that has had a relevant impact on its production process: the SIVAM5 vision system, developed by CARTIF and integrated in cold drawing lines of laminated sheet metal.

As we all know, the surface quality of the vehicle's exterior is key for users, which is why companies in the automotive sector have to make a significant effort to detect and correct defects in the bodywork of their vehicles. Most of these defects occur at the stamping stage, but because of the inconsistent colour of the sheet metal and the diffuse reflections it generates, in some cases these defects go unnoticed until the body assembly stage and then the painting stage, after which they become noticeable. This means that a small defect not detected in time translates into a large cost for the production of the vehicle.

To detect these defects at an early stage, we have developed an innovative machine vision system to detect the micro-cracks and pores that are generated in the cold stamping process of rolled sheet metal. This is a clear example of a robust solution based on a simple idea, “the passage of light through the pores of the sheet metal”, but where a great technological effort has been made to implement the idea in the production line. To this end, various optical technologies have been combined with the development of complex mechanical systems, resulting in a high-performance technological solution, capable of carrying out an exhaustive inspection of the critical points of the sheets in 100% of production and without penalising the short cycle times that characterise press lines.
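
The simple idea itself, light passing through a pore, can be illustrated with a few lines of Python. This is a conceptual sketch only: it assumes a backlit sheet imaged as a 2D grid of grey levels (0 to 255) and a fixed threshold, and it does not describe the actual SIVAM5 processing, which is not detailed here.

```python
def find_light_leaks(image, threshold):
    """Return (row, column) of every pixel brighter than the threshold.

    image: list of rows, each a list of grey levels (0 = dark, 255 = bright).
    With the light source behind the sheet, intact metal stays dark and any
    pore or crack shows up as a bright spot.
    """
    return [
        (r, c)
        for r, row in enumerate(image)
        for c, pixel in enumerate(row)
        if pixel > threshold
    ]
```

The detection logic is trivial precisely because the lighting and the mechanics do the hard work: once the scene guarantees that only defects let light through, a simple threshold is enough.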

Thanks to its excellent resistance to vibrations and impacts, its great adaptability for the integration of new references and its reliability in the detection of defects, a robust, flexible and reliable solution has been obtained. Based on a simple idea, a robust solution has been implemented in the production process of large companies in the automotive sector, such as Renault and Gestamp, where it has been operating without updates for more than 20 years, working day and night.

SIVAM5 multicamera visual inspection system

Hard to measure

Researchers are increasingly confronted with the need to “digitalise” something that has not been digitalised before: temperatures, pressures, energy consumption, etc. In these cases we look for a measurement system or sensor in a commercial catalogue: a temperature probe, a pressure switch, a clamp ammeter for measuring electric current, etc.

Sometimes, however, we need to measure “something” for which no commercial sensor can be found. This may be because the measurement need is uncommon and there is not enough market for this type of sensor or, simply, because no commercial technical solution exists, for various reasons. For example, it may be necessary to measure characteristics such as the humidity of streams of solid matter, or characteristics that can only be measured indirectly in a quality control laboratory and that require a high level of experimentation.

Sometimes characteristics also have to be measured in very harsh environments, whether because of high temperatures, as in a melting furnace, or because of large amounts of dust that saturate any conventional measurement system. It may also be necessary to evaluate a characteristic that is not evenly distributed (for example, the quantity of fat in a piece of meat, or the presence of impurities). Another factor to take into account is that it is not always possible to install a sensor without interfering with the manufacturing process of the material we want to measure, or that the only option is to take a sample, analyse it off-line and obtain a value some time later, but never in real time.

In these situations, it is necessary to resort to custom-made solutions that we call smart sensors or cognitive sensors. Beyond sounding exotic or cool, these are solutions that use a series of “conventional” sensors together with software or algorithms, for example artificial intelligence, that process the measurements returned by these commercial sensors to give as accurate an estimate as possible of the quality we want to measure.
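
In its simplest form, the algorithm behind such a sensor is a calibration model that maps a conventional reading to the quantity of interest. The Python sketch below fits a straight line by ordinary least squares using laboratory reference values; a real smart sensor would typically combine several signals and a more capable model, and the function names here are hypothetical.

```python
def fit_line(readings, references):
    """Fit reference = slope * reading + intercept by ordinary least squares."""
    n = len(readings)
    mean_x = sum(readings) / n
    mean_y = sum(references) / n
    sxx = sum((x - mean_x) ** 2 for x in readings)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(readings, references))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate(reading, model):
    """Estimate the hard-to-measure quantity from a new sensor reading."""
    slope, intercept = model
    return slope * reading + intercept
```

The model is fitted once under controlled laboratory conditions, where the true value can be measured, and then used online to estimate that value from the conventional sensor alone.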

We are currently developing these types of smart sensors for different process industries, such as asphalt manufacturing, steel billets and bars, and the pharmaceutical industry (e.g. pills), in the framework of the European project CAPRI.

For example, in the manufacture of asphalt, sands of different sizes need to be dried before they are mixed with bitumen. During the continuous drying process of these sands, the finer sand size, called filler, is “released” in the form of dust from larger aggregates, and this dust needs to be industrially vacuumed using what is called a bag filter. Nowadays, the drying and suction of filler is done in a way that ensures that all the filler is extracted. The disadvantage of this process is that it is then necessary to add additional filler when mixing the dried sands with the bitumen, because the filler improves the cohesion of the mix by filling the gaps between the sand grains. All this drying and complete suction of the filler entails an energy cost; in order to minimise it, a measurement of the filler present in the sand mixture would be needed. Today, this measurement is obtained on a spot basis through a granulometric analysis in a laboratory, with a sample of the material taken before drying.

Within the CAPRI project we are working on the complex task of measuring the flow of filler sucked in during the drying process. There is no sensor on the market that is guaranteed to measure a large concentration of dust (200,000 mg/m³) in suspension at high temperatures (150-200 ºC).

Within the framework of the project, a solution to this problem has been developed. You can consult the laboratory results in the research article recently published in the scientific journal Sensors (“Vibration-Based Smart Sensor for High-Flow Dust Measurement”).
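
To give an idea of the magnitudes involved, the mass flow of dust is the concentration multiplied by the volumetric air flow. The short Python sketch below does the unit conversion; the air flow of 10 m³/s used in the note is a purely hypothetical figure, not a value from the project.

```python
def dust_mass_flow_kg_per_s(concentration_mg_m3, air_flow_m3_s):
    """Mass flow = concentration x volumetric flow, converted from mg to kg."""
    mg_per_kg = 1_000_000
    return concentration_mg_m3 / mg_per_kg * air_flow_m3_s
```

A concentration of 200,000 mg/m³ is 0.2 kg of dust per cubic metre of air, so with the assumed 10 m³/s of suction air the filter would receive about 2 kg of filler every second.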

The development of this type of sensors requires various laboratory tests to be carried out under controlled conditions to verify the feasibility of this solution and then, also under laboratory conditions, to carry out calibrated tests to ensure that it is possible to estimate the true flow of filler sucked in during the sand drying process. CAPRI Project has successfully completed the testing of this sensor and others belonging to the manufacture of steel bars and pharmaceutical pills.

The project, in its commitment to the open science initiative promoted by the European Commission, has published on its Zenodo channel different results of these laboratory tests, which allow us to corroborate the preliminary success of these sensors, pending their validation and testing in the production areas of the project partners. In the near future we will be able to share the results of the industrial operation of this and other sensors developed in the project.


Cristina Vega Martínez. Industrial Engineer. Coordinator at CAPRI H2020 Project