
Researchers at the Massachusetts Institute of Technology are developing new methods that allow artificial intelligence systems to explain their decisions, with the aim of improving transparency without sacrificing accuracy, so that humans can understand the rationale behind each prediction.
This field, known as “explainable artificial intelligence,” is increasingly important in critical sectors such as healthcare, transportation, and scientific research, where users need to understand what led to a particular outcome before they can trust or rely on it. A doctor, for example, may want to know why a system suggested a certain diagnosis, and engineers building self-driving cars need to know which signals the system relied on to interpret a specific traffic situation.
The researchers focus on an approach called the “concept bottleneck model,” which first extracts a set of concepts or characteristics that are easy for humans to understand and then uses them as the basis for the final decision. When classifying birds, for example, the system might first recognize visual features such as “blue wings” or “yellow legs” before determining the species. In a medical setting, the concepts might be specific tissue patterns that help diagnose a disease.
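The two-stage structure can be made concrete with a minimal sketch of a concept bottleneck classifier in PyTorch. The concept names, layer sizes, and random input below are illustrative assumptions for the bird example, not the MIT team's actual implementation; the point is that the final classifier sees only the predicted concepts, so each decision can be traced back to them.

```python
import torch
import torch.nn as nn

# Hypothetical, human-readable concepts used for illustration.
CONCEPTS = ["blue_wings", "yellow_legs", "long_beak"]

class ConceptBottleneckModel(nn.Module):
    def __init__(self, input_dim: int, num_concepts: int, num_classes: int):
        super().__init__()
        # Stage 1: predict each human-understandable concept from the raw features.
        self.concept_predictor = nn.Sequential(
            nn.Linear(input_dim, 64),
            nn.ReLU(),
            nn.Linear(64, num_concepts),
        )
        # Stage 2: the label predictor sees ONLY the concepts, never the raw input.
        self.label_predictor = nn.Linear(num_concepts, num_classes)

    def forward(self, x):
        concepts = torch.sigmoid(self.concept_predictor(x))  # probability each concept is present
        class_logits = self.label_predictor(concepts)
        return class_logits, concepts

model = ConceptBottleneckModel(input_dim=32, num_concepts=len(CONCEPTS), num_classes=10)
x = torch.randn(1, 32)  # stand-in for extracted image features
class_logits, concepts = model(x)

# The concept scores double as the explanation shown to the user.
for name, score in zip(CONCEPTS, concepts[0].tolist()):
    print(f"{name}: {score:.2f}")
print("predicted class:", class_logits.argmax(dim=-1).item())
```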
The biggest challenge with this approach is extracting precise concepts that truly reflect the patterns the model has learned, rather than relying on general concepts defined in advance by experts, which may not capture the complexity of the task. To address this, the MIT team developed a technique that identifies a model's internal patterns and translates them into concepts understandable to humans, so that these concepts become part of the model's decision-making process.
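One common way to surface internal patterns as concepts, sketched below under assumed details rather than the MIT team's exact method, is to compare a network's hidden activations to a bank of learned "concept vectors" and feed the similarity scores, instead of the raw activations, into the final classifier.

```python
import torch
import torch.nn.functional as F

hidden_dim, num_concepts, num_classes = 64, 3, 10

hidden = torch.randn(1, hidden_dim)                   # activation from an intermediate layer
concept_bank = torch.randn(num_concepts, hidden_dim)  # one learned direction per concept (assumed given)
classifier = torch.nn.Linear(num_concepts, num_classes)

# Similarity to each concept direction serves as an interpretable concept score.
scores = F.cosine_similarity(hidden.unsqueeze(1), concept_bank.unsqueeze(0), dim=-1)
logits = classifier(scores)

print("concept scores:", scores.squeeze(0).tolist())
print("prediction:", logits.argmax(dim=-1).item())
```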
The researchers report that this method strikes a balance between accuracy and transparency: the system focuses on the elements that matter most for each decision and reduces its reliance on hidden relationships inside the model that are difficult to interpret.
As organizations increasingly rely on AI, these approaches are expected to enhance the reliability of systems, detect potential biases, and ensure algorithms work as expected, representing a critical step toward more accountable and transparent AI systems.