Presentation Transcript

1. Perception as Abduction: Turning Sensor Data Into Meaningful Representation
By: Murray Shanahan, Imperial College, London, England
Paper Review By: Christian Hahm, Temple AGI Team

2. Introduction (1)
Researchers who believe in a computational theory of mind (i.e., AI researchers) must explain how their system’s symbols acquire semantic meaning.
The common claim is that the AI system constructs “higher-level” (abstract) concepts and understanding from the “low-level” (sensory) information. To back up the claim, one must also explain in detail how their AI system transforms low-level sensory data into high-level constructions.
This paper offers abductive logical reasoning as the path to perception. It will address:
Uncertainty
Top-down information flow
Attention and sensor fusion

3. An abductive account of perception (2)
Abduction is the inferring of explanations from observations.
Γ = observed events from the sensory data
Δ = (inferred by the system) a description of the state of the world
Σ = the system’s sensorimotor (background) knowledge
Perception is the task of finding a Δ such that: Σ ∧ Δ ⊨ Γ
i.e., the system is in a constant loop of attempting to explain the subset of sensory data (Γ) by forming hypotheses (Δ) based on background knowledge (Σ).
⊨ is a “double turnstile” representing semantic entailment; ⊢ is a “turnstile” representing syntactic derivation.
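A minimal propositional sketch of this abductive loop, under my own simplifying assumptions (the rule base, the assumable facts, and names like obstacle_ahead are illustrative, not from the paper): candidate explanations Δ are enumerated and kept only if, together with Σ, they entail the observed Γ.

```python
from itertools import combinations

# Toy background knowledge Sigma: rule body -> rule head (all names illustrative).
SIGMA = {
    ("obstacle_ahead",): "bump_front",
    ("gap_left",): "low_left",
}
ASSUMABLES = ["obstacle_ahead", "gap_left", "door_shut"]  # vocabulary for hypotheses (Delta)
GAMMA = {"bump_front"}                                    # observed sensor events

def consequences(facts):
    """Forward-chain SIGMA over a set of assumed facts until a fixed point."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in SIGMA.items():
            if set(body) <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived

def abduce(observations, max_size=2):
    """Return the smallest hypothesis sets Delta with SIGMA ^ Delta |= observations."""
    for size in range(1, max_size + 1):
        hits = [set(delta) for delta in combinations(ASSUMABLES, size)
                if observations <= consequences(delta)]
        if hits:
            return hits
    return []

print(abduce(GAMMA))  # -> [{'obstacle_ahead'}]
```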

4. An abductive account of perception
2.1 Incompleteness and uncertainty
A cognitive system never has complete knowledge; it must instead function with uncertain and incomplete knowledge.
The author argues that logic is strong enough to handle uncertainty and incompleteness, and thus perception.
An example of a logical description Δ generated during visual perception says, in words: “there is a solid object (x) seen from side (a), AND the object (x) is a cuboid shape, AND there is a face (r) on that solid cuboid (x), AND there is an edge (E0) on the side of that face (r)”.

5. An abductive account of perception
2.2 Top-down information flow
Just as low-level information builds into high-level understanding (a bottom-up process), high-level cognitive processes influence the perception of low-level information (a top-down process).
The system should produce Δ according to what the sensations afford the system; that is, the system seeks to form perceptual descriptions which help it achieve its goals.
One top-down mechanism is expectation from past experience (what we call in NARS “anticipation”).
Top-down expectations reduce the computational load of finding information in a very complex and cluttered scene.
Inkblot: what do you see?

6. An abductive account of perception
2.3 Active perception
There are many forms of “active perception”, such as expectation, but also physical movement: actions yield new information. For example, moving your eye results in new sensations Γ from a different angle, which may help you analyze a scene.
Generating all Δ (object descriptions) for a large Γ (set of sensations) is computationally intractable. Therefore, the system uses attention to keep the size of Γ small.
A full abductive account of perception must loop through these 3 key components (see the sketch after this slide):
Attention. Through attention based on its expectations, the system focuses on a certain aspect of the environment.
Explanation. The system turns the attended sensations into explanations about the world.
Expectation. Based on its hypothesized explanations, the system forms expectations which are confirmed or refuted as the system lives. This influences attention in the future, thus completing the loop.
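A minimal Python skeleton of this attention–explanation–expectation loop, under my own simplifying assumptions (the attend/explain/expect functions and their data structures are illustrative placeholders, not the paper's implementation):

```python
# Hypothetical sketch of the perception loop; every function body is a stand-in.

def attend(sensations, expectations):
    """Keep only the sensations that current expectations make relevant (keeps Gamma small)."""
    if not expectations:
        return sensations
    return [s for s in sensations if s in expectations]

def explain(gamma, background):
    """Abduce a hypothesis Delta that, with the background theory, accounts for Gamma."""
    return {"explains": sorted(gamma)}            # placeholder hypothesis

def expect(delta, background):
    """Derive sensor events the hypothesis predicts we should see next."""
    return set(delta["explains"])                 # placeholder expectations

def perception_loop(sense, background, steps=3):
    expectations = set()
    for _ in range(steps):
        gamma = attend(sense(), expectations)     # 1. attention
        delta = explain(gamma, background)        # 2. explanation (abduction)
        expectations = expect(delta, background)  # 3. expectation, feeding back into attention
    return delta

if __name__ == "__main__":
    fake_sensor = lambda: ["edge_left", "edge_top", "glare"]
    print(perception_loop(fake_sensor, background={}))
```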

7. An abductive account of perception
2.4 Sensor fusion
“Sensor fusion” is the combining of information from multiple sensors into a unified model of the world.
Γ may contain information from a single sensor or from multiple sensors. For the purposes of perception, it does not matter where the sensations come from, so long as Σ contains formulae describing how a configuration of objects (Δ) gives rise to the various sensations in Γ. Σ should also contain formulae that can explain away noise and abnormalities.
Then, the explanation Δ with the highest priority (the one that best fits the data Γ) is selected.
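A small sketch of this idea, with made-up sensor streams, hypotheses, and scoring rule (none of these names are from the paper): events from several sensors are pooled into one Γ, and candidate explanations are ranked so that a hypothesis which dismisses data as noise scores lower.

```python
# Illustrative only: sensor names, hypotheses, and the scoring rule are assumptions.

SENSOR_STREAMS = {
    "bump":   ["bump_front"],
    "vision": ["edge_ahead"],
}
GAMMA = {e for events in SENSOR_STREAMS.values() for e in events}  # fused observation set

# Each candidate explanation lists which observations it accounts for,
# and which it writes off as noise.
CANDIDATES = [
    {"name": "obstacle_ahead", "explains": {"bump_front", "edge_ahead"}, "noise": set()},
    {"name": "sensor_glitch",  "explains": set(),                        "noise": set(GAMMA)},
]

def value(hypothesis):
    """Toy priority: explained observations count for, noise terms count against."""
    return len(hypothesis["explains"] & GAMMA) - len(hypothesis["noise"])

best = max(CANDIDATES, key=value)
print(best["name"])  # -> obstacle_ahead
```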

8. Elaborating and applying the abductive account (3)
Now, we will review experiments which utilize the logical approach to perception with real robots.

9. Elaborating and applying the abductive account
3.1 The Rug Warrior experiment
A 2-wheel robot (the “Rug Warrior”) with sensors to detect bumping into walls and obstacles.
It explores by bumping into things while simultaneously constructing an internal map of the environment. It uses the “event calculus” (a way to reason about events using first-order logic).
Since the bumping sensor is activated by the robot’s own motion, it performs active perception.
Variables:
Γ = a conjunction of observations, each describing a bump event (e.g., “bump front”, “bump left”)
Σ = background theory describing the impact of the robot’s actions on the world (i.e., movement) and the impact of the world on the robot’s sensors (e.g., “obstacle in front” + “moving forward” = “bump front”)
N = a conjunction of sentences, each describing a robot action (e.g., “turn left”, “move forward”)
The goal of robot perception is to find a map of the environment M, describing the location of walls relative to the robot’s position, such that: Σ ∧ N ∧ M ⊨ Γ
This ultimately means the system can predict which bump events Γ it will experience if it performs certain movements N at its current location in map M, according to Σ. Although the Rug Warrior adhered to this formula, the algorithm was special-purpose, not a mechanism which could be generalized, since the robot had neither explicit logical representation nor reasoning.
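A toy version of the Σ ∧ N ∧ M ⊨ Γ search on a 1-D corridor, with all details (grid size, action names, the simulator) invented for illustration: candidate maps M are enumerated and kept if simulating the actions N over them reproduces the observed bump events Γ.

```python
from itertools import combinations

CELLS = range(5)                               # a tiny 1-D corridor (assumption)
N_ACTIONS = ["forward", "forward", "forward"]  # the robot's actions (N)
GAMMA = [(2, "bump_front")]                    # observed: a bump during the 3rd move

def simulate(actions, walls):
    """Sigma as code: moving forward into a wall yields a bump and no movement."""
    pos, events = 0, []
    for t, act in enumerate(actions):
        if act == "forward":
            if pos + 1 in walls:
                events.append((t, "bump_front"))
            else:
                pos += 1
    return events

def abduce_map(actions, observed, max_walls=1):
    """Find wall layouts M consistent with Sigma, N and Gamma."""
    maps = []
    for k in range(max_walls + 1):
        for walls in combinations(CELLS, k):
            if simulate(actions, set(walls)) == observed:
                maps.append(set(walls))
    return maps

print(abduce_map(N_ACTIONS, GAMMA))  # -> [{3}]: a wall just beyond the second cell
```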

10. Elaborating and applying the abductive account
3.2 Reinventing Shakey
The Khepera robot in the second experiment was set up, like Shakey the Robot, as a reasoning system.
Shakey could sometimes get stuck in a very long inference chain, which caused it to work slowly. Unlike Shakey, the Khepera worked in real time, deriving a single step of a proof from logical sentences during a short, constant-time working step.
The robot performed abductive inference to explain sensor events in terms of what is happening in the environment.
As long as the robot moved along according to its expectation of a clear path, it continued moving as normal. If there was a sensory anomaly and the only explanation was a blocked path, the robot triggered replanning.
Shakey the Robot: the first embodied reasoning system. Created at the Stanford Research Institute in 1972.

11. Elaborating and applying the abductive account
3.3 A simple worked example of active perception as abduction (1/4)
[Slide figure: the event calculus predicates used to describe events in the AI system (above), and 3 axioms describing how they are logically related to each other (below).]
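The slide images are not reproduced in this transcript; for reference, a standard formulation of the simplified event calculus (the style of axioms Shanahan typically uses; the slide's exact notation may differ) is:

```latex
\begin{align*}
\mathrm{HoldsAt}(f, t) &\leftarrow \mathrm{Initially}(f) \land \neg \mathrm{Clipped}(0, f, t) \\
\mathrm{HoldsAt}(f, t_2) &\leftarrow \mathrm{Happens}(a, t_1) \land \mathrm{Initiates}(a, f, t_1) \land t_1 < t_2 \land \neg \mathrm{Clipped}(t_1, f, t_2) \\
\mathrm{Clipped}(t_1, f, t_2) &\leftrightarrow \exists\, a, t \, [\mathrm{Happens}(a, t) \land t_1 \le t < t_2 \land \mathrm{Terminates}(a, f, t)]
\end{align*}
```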

12. Elaborating and applying the abductive account
3.3 A simple worked example of active perception as abduction (2/4)
It is assumed that the known event occurrences are the only event occurrences, and the known effects of actions are the only effects of actions.
Then, given background knowledge describing the effects of actions (Σ) through event calculus formulae Initiates() and Terminates(), together with the current context of events and actions given through Happens() formulae, the logical conjunction of the two entails a set of HoldsAt() formulae describing which fluents are true. This is the set of the system’s predictions, achieved through deduction.
However, this is only the prediction step. We are also interested in explanation using abduction, so the system can determine the causes of sensory events.
Step 1: Prediction (deduction)
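A minimal, hand-rolled sketch of this prediction step (the fluent and action names are invented, and a real event calculus would be axiomatized in logic rather than simulated in Python): given Initiates/Terminates effects and a Happens narrative, compute which fluents hold at each time.

```python
# Toy effect axioms (Sigma): action -> (initiated fluents, terminated fluents).
EFFECTS = {
    "open_door":  ({"DoorOpen"}, set()),
    "close_door": (set(), {"DoorOpen"}),
}
# Toy narrative: Happens(action, time).
HAPPENS = [("open_door", 1), ("close_door", 3)]
INITIALLY = set()          # no fluents hold at time 0

def holds_at(t):
    """Deduce the fluents that hold at time t from EFFECTS, HAPPENS and INITIALLY."""
    fluents = set(INITIALLY)
    for action, when in sorted(HAPPENS, key=lambda h: h[1]):
        if when < t:
            initiated, terminated = EFFECTS[action]
            fluents |= initiated
            fluents -= terminated
    return fluents

for t in range(5):
    print(t, sorted(holds_at(t)))
# times 0-1: nothing holds; times 2-3: DoorOpen holds; time 4: nothing holds again
```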

13. Elaborating and applying the abductive account
3.3 A simple worked example of active perception as abduction (3/4)
At a given time, the system has a set Γ of sensory events, a set N describing the robot’s initial state and its subsequent actions, and a description Σ describing the effects of actions. The abductive task is to find a set of features of the world M such that: Σ ∧ N ∧ M ⊨ Γ
In other words, the system must determine the set of knowledge M that, in conjunction with N and Σ, explains the sensory events Γ.
In this example, we consider a sensor event GoesHigh(Front), which occurs when an obstacle is in front of the robot, and GoesLow(Left), which occurs when there is a gap to the robot’s left.
Step 2: Explanation (abduction)

14. Consider a world:
The world’s features are described in the AI system by a set denoted M1, e.g., C1 is an inner corner.
Along with this is background knowledge about how to navigate the world, denoted Σ, e.g., if I am at corner C1 and I can see C2, then by doing the FollowWall operation I will be at C2.

15. Elaborating and applying the abductive account
3.3 A simple worked example of active perception as abduction (4/4)
Up to this point the robot has experienced a contextual history N (the robot started at corner C1 and began a FollowWall operation). Then it suddenly experiences an event Γ saying there is an obstacle in front: Happens(GoesHigh(Front), T2).
The AI system must now use abduction to determine additional world features M2 that explain the current situation, by solving: Σ ∧ N ∧ M ⊨ Γ, where M = M1 ∪ M2.
In this case the answer is M2 = DoorShut(d), since a shut door would imply that the next corner (from C1, where the robot started FollowWall) is C4, which in turn would explain the front sensor event Happens(GoesHigh(Front), T2) [that is, if the door is shut!].
Step 2: Explanation (abduction)
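A toy encoding of this abduction step, with the door/corner logic compressed into one small Python predicate (the candidate features and the predict function are my own illustrative stand-ins for Σ, N and M1):

```python
# Assumed world knowledge: following the wall from C1 reaches C2 if the door is open,
# but dead-ends at C4 (triggering the front sensor) if the door is shut.

GAMMA = {"GoesHigh(Front)@T2"}                            # the surprising sensor event
CANDIDATE_M2 = [set(), {"DoorShut(d)"}, {"DoorOpen(d)"}]  # possible extra world features

def predict(extra_features):
    """Sigma ^ N ^ (M1 u M2) as code: which sensor events does FollowWall from C1 produce?"""
    if "DoorShut(d)" in extra_features:
        return {"GoesHigh(Front)@T2"}   # next corner is C4 -> obstacle ahead
    return set()                        # path to C2 is clear -> no event

explanations = [m2 for m2 in CANDIDATE_M2 if GAMMA <= predict(m2)]
print(explanations)                     # -> [{'DoorShut(d)'}]
```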

16. Elaborating and applying the abductive account
3.4 Visual perception through abduction (1/2)
The robots Rug Warrior and Khepera were proofs of concept, showing that logic can be used for simple robot control. But the question remained whether the ideas would work for richer sensors and motors and for more complex tasks.
Ludwig was created: a humanoid robot with 2 articulated arms (i.e., with rotating joints) and a binocular (two-eye) camera which can pan and tilt.
The robot was constructed with simplicity in mind, to simplify the motor control problem and focus on cognition. The project stresses that active embodiment is necessary for cognition, not optional.
Ludwig was built to play: interacting with objects and learning about them. Its visual system was meant to pick out objects of interest and direct the camera’s gaze towards them, then combine the visual information with arm movements to nudge the objects.
By the end, Ludwig is meant to construct declarative sentences of logic pertaining to the environment.

17. Elaborating and applying the abductive account
3.4 Visual perception through abduction (2/2)
Shanahan developed 2 improvements to the system:
A numerical measure of explanatory value for comparing competing explanations. It is based on probability, such that the more items a hypothesis explains, the greater its explanatory value.
The ability to confirm or disconfirm expectations after performing actions.
Furthermore, the system is allowed to include “noise terms” in its explanations. Since a noise term has a low probability, the explanatory value of an explanation containing it is also low. This way, a hypothesis which properly explains sensor data has a higher value than a hypothesis which explains it away as noise.
The abductive processing of low-level sensor data is as follows:
1. Use abduction to infer one or more hypotheses that can explain the sensor data.
2. Select the best hypothesis, ranked by explanatory value.
3. Determine what sensor events to anticipate based on the selected hypothesis.
4. Consult the sensor data to confirm or disconfirm the hypothesis.
5. Recalculate the explanatory values of the hypotheses based on the anticipations in Step 3.
See Figure 3 to the right. First, a Sobel filter with a high threshold is applied to the entire raw camera image for edge detection on the 2 Lego bricks. Since the left Lego is darker, its detected edges are weaker than those of the lighter Lego on the right. These segments are the basis for inference.
Now the system has 2 abductive hypotheses to explain the edge data: Hypothesis A is that there is a single Lego, plus some noise on the left (incorrect). Hypothesis B is that there are 2 Legos but many edges are missing (correct).
Generate expectations: a consequence of Hypothesis A is that there should be a vertical edge from the top-left corner of the block (at the intersection of segments 5-6). A consequence of Hypothesis B is that the top edge of the block should extend further than it appears to. The system can test these expectations using a more sensitive edge-detection pass at the points of interest. This leads to a decrease in the explanatory value of Hypothesis A and an increase in the explanatory value of Hypothesis B.
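A small NumPy/SciPy sketch of the image-processing side, under stated assumptions (a synthetic image stands in for the camera frame, and the thresholds are arbitrary): a high global threshold keeps only the strong edges, and a lower threshold applied to a small window re-checks a point of interest suggested top-down.

```python
import numpy as np
from scipy import ndimage

# Synthetic frame: a dim block on the left, a bright block on the right (stand-ins for the Legos).
img = np.zeros((60, 120), dtype=float)
img[20:40, 10:50] = 0.2     # dark Lego -> weak edges
img[20:40, 70:110] = 0.9    # light Lego -> strong edges

def edge_magnitude(image):
    """Gradient magnitude from Sobel filters along both axes."""
    gx = ndimage.sobel(image, axis=1)
    gy = ndimage.sobel(image, axis=0)
    return np.hypot(gx, gy)

mag = edge_magnitude(img)
strong_edges = mag > 1.5    # high global threshold: the dim block's edges fall below it
print("strong edge pixels:", int(strong_edges.sum()))

# Top-down re-check: a hypothesis expects an edge near (row 20, col 10), so look again
# in a small window with a more sensitive (lower) threshold.
window = mag[15:45, 5:15]
print("edge found on re-check:", bool((window > 0.3).any()))
```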

18. Elaborating and applying the abductive account
3.5 A worked example of top-down information flow (1/3)
First, image data is pre-processed using image segmentation, optical flow, etc. (for this example we use edge detection; see the image to the right – line # from point x to point y).
The abduction process is structured as multiple layers, where an explanation Δ at layer n becomes the data Γ at layer n+1. For this example we use 2 layers (see the sketch after this slide):
Layer 1 explains one-dimensional edges in terms of 2D regions.
Layer 2 explains the 2D regions in terms of 3D objects.
Each layer contains a background theory Σ consisting of 3 sets of formulas. Consider layer 1:
Formulas used to generate explanatory hypotheses. These state that a line w might be the side of a trapezoidal region r, or potentially noise.
Formulas describing constraints on permitted hypotheses. These state that if 2 lines are sides of the same trapezoidal region r, then the lines must be parallel or joined.
Formulas describing the expectations of each hypothesis (described later).
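A structural sketch of the layering idea, with placeholder explain functions (the layer contents are invented; only the wiring, where Δ produced at layer n is fed in as Γ at layer n+1, is the point):

```python
# Hypothetical two-layer abduction pipeline; each layer's explain() is a stand-in.

def explain_edges_as_regions(edges):
    """Layer 1: group edge labels into region hypotheses (placeholder logic)."""
    return [{"region": "r1", "sides": edges[:2]}]   # toy: first two edges form a region

def explain_regions_as_objects(regions):
    """Layer 2: explain region hypotheses as faces of 3D solids (placeholder logic)."""
    return [{"solid": "x1", "faces": [r["region"] for r in regions]}]

LAYERS = [explain_edges_as_regions, explain_regions_as_objects]

def layered_abduction(gamma):
    """Delta produced at layer n becomes Gamma at layer n+1."""
    for layer in LAYERS:
        gamma = layer(gamma)
    return gamma

print(layered_abduction(["edge1", "edge3", "edge7"]))
```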

19. Elaborating and applying the abductive account
3.5 A worked example of top-down information flow (2/3)
Using the edge detections and the previous formulas, the system can form 2 hypotheses: Hypothesis A* is that lines 3 and 1 are the edges of a region, while 7 is noise; Hypothesis B* is that lines 3 and 7 are the edges, while 1 is noise.
Since both hypotheses explain 1 region and contain 1 noise term, they have equal explanatory value at layer 1. Now on to the layer 2 formulas.
The formula used to generate hypotheses says that if there is a region r, then there exists a solid x facing us at aspect a, and r is a face of the solid x.
The formula representing constraints says that every pair of faces in a “3Face” aspect (see image) is abutting (they share an edge). This rules out some hypotheses.

20. Elaborating and applying the abductive account
3.5 A worked example of top-down information flow (3/3)
In layer 2 we also have 2 competing hypotheses: Hypothesis A contains Hypothesis A*, and Hypothesis B contains B*. Both posit 2 regions, though they hypothesize them in different places.
Hypothesis B is the correct one; however, Hypothesis A has a higher explanatory value since it contains fewer noise terms. To perceive the situation “correctly”, the system must use top-down expectation. Layer 2 therefore contains an expectation formula. It says that if 2 faces abut, there should be a third face that also abuts both of them.
The system can evaluate this expectation for each hypothesis. For example, if the system expects a face but it is missing, it can go back to the raw image data and reprocess it to try to find the face. Since the expected properties are highly specific, the search can be very constrained and sensitive.
Finally, the explanatory values are recalculated in light of the confirmed and missing expectations. If an expectation is missing, an “abnormality” term is introduced into the hypothesis to indicate that, which, like a noise term, lowers the explanatory value.
In each hypothesis, there is an expected third region which should abut the 2 hypothesized regions. By checking the image again with a more sensitive edge filter, the third region is confirmed for Hypothesis B but not for Hypothesis A, ultimately meaning Hypothesis B wins with the higher explanatory value.
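A toy version of the scoring arithmetic, under my own assumptions (explanatory value treated as a product of per-term probabilities, with low probabilities for noise and abnormality terms; the exact measure in the paper may differ):

```python
# Hypothetical probabilities: ordinary explanatory terms are likely,
# noise and abnormality terms are unlikely.
P = {"region": 0.9, "noise": 0.1, "abnormality": 0.1}

def explanatory_value(terms):
    """Multiply per-term probabilities: more noise/abnormality terms -> lower value."""
    value = 1.0
    for term in terms:
        value *= P[term]
    return value

# Before the top-down check: A explains one region and writes the rest off as noise,
# B posits two regions plus a noise term, so A scores slightly higher.
hyp_a = ["region", "noise"]
hyp_b = ["region", "region", "noise"]
print(explanatory_value(hyp_a), explanatory_value(hyp_b))   # 0.09 vs 0.081

# After re-checking the image: the expected third region is found for B but not for A,
# so A gains an abnormality term while B gains a region term.
hyp_a += ["abnormality"]
hyp_b += ["region"]
print(explanatory_value(hyp_a), explanatory_value(hyp_b))   # now B scores higher
```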

21. Elaborating and applying the abductive account
3.6 Active visual perception
That was static visual perception; now we review active visual perception. To the right are 2 frames from Ludwig’s camera, before and after the object was nudged by Ludwig.
In the left frame there was a lack of edges – the object could be either an Lface block or a Tface block. In the right frame, the ambiguity is resolved.
Here a third layer of abduction, meant to explain sequences, is added to supplement the first 2 layers. It includes a timestamp t. At frame 1 there are 2 hypotheses: Hyp. A is that the object is an Lface at aspect 4; Hyp. B is that the object is a Tface at aspect 4.
At frame 2, there are 2 different hypotheses: Hyp. A is that the object is a cuboid at aspect 3; Hyp. B is that the object is an Lface at aspect 2.
The system hypothesized an Lface in both frames (though from different viewpoints) and knows from the background theory Σ that nudging a shape can cause it to show a different aspect. As such, it can conclude that the shape is an Lface.

22. Related work
In the psychology literature, this work is related to Irvin Rock’s theory of indirect perception (1983), which describes how people create hypotheses about the world and test them. It is based on the ideas of Hermann von Helmholtz.
In the AI literature, abductive processing can be seen as similar to the computer vision technique of matching features to possible objects. However, such systems neither use a formal logic nor perform inference; instead, the high-level knowledge is confined to the structure of the objects to be recognized.
The abductive inference together with confirming expectations is similar to “hypothetico-deductive” reasoning in logic programming (Kakas and Evans, 1992), though here it is realized in an artificial perception context.
There is almost no work on using logic for image interpretation and perception. The few works that exist are theoretical, without concrete experiments (Reiter and Mackworth, 1989). The author knows of only one group (other than his own) focusing on logic for perception (Pirri et al., 2002).
Chella et al. (2000) is the closest to this work. They developed a vision architecture that uses declarative knowledge and reasoning to suggest hypotheses that explain sensor data. The expectations from these hypotheses control an attention mechanism to further analyse the image. However, that work relies more on neural networks and does not emphasize logical inference.

23. Philosophical afterthoughts
This abductive theory of robot perception was developed in the belief that perception, action, and cognition are inseparable and cannot be studied separately in AI. In particular, it seeks to describe in technical detail how to incorporate action and top-down information flow in logic-based AI.
Logic and philosophy of mind have been intertwined for thousands of years, up to the present day. AI is both an engineering and a philosophical problem. In the following, the author presents his philosophical thoughts through the science-fiction scenario of an engineer working on an advanced humanoid robot.

24. Philosophical afterthoughts
5.1 Representational theories of mind
The engineer reads philosophy. In the 19th century, Brentano issued a challenge to materialists to find a theory of mind that accounts for “intentionality”: the idea that beliefs are always about something, even though those things can be non-existent (unicorns) and the beliefs can be objectively false (which allows for error).
The challenge was met mostly by representational theories of mind, meaning a mental state or belief can be explicitly represented using, e.g., a sentence. Some theories also posit a type of computation over these sentences.
The engineer also reads neuroscience, and notices the contrast between the messiness of the brain and the rigidity of representational theories of mind. She remembers that some philosophers (Churchland, 1979) reject representational theories of mind because of this observation.
The engineer realizes that if robots were more advanced, they would be excellent subjects for testing representational theories of mind. Unlike the philosophers debating amongst themselves how to represent such propositions, she immediately sees that first-order logic is suited to the task.

25. Philosophical afterthoughts
5.2 Intentionality and symbol grounding
As the engineer reads more philosophy, she finds commonalities between the objections to representational theories of mind – the first being intentionality. If thoughts are represented as abstract symbols, how can they represent anything real? The best arguments are that the symbols are “grounded” by perception of the physical world.
Setting the human mind aside, the path of “groundedness” is quite clear in her Ludwig robot, since we can trace the path from light, to camera, to pre-processor (producing Γ), to simple logical feature representations, to the final logical object representations.
Thanks to logic, the story of perception can be described on a purely mechanical/physical level (in terms of the algorithm) but also on a symbolic and logical level, which is more valuable to humans.
For example, a student places a cup on the table. The robot looks at the moving object, recognizes it as a cup, and pours tea into it. The audience applauds. We can explain why the robot “knows” there is a cup on the table beyond a purely mathematical algorithm, and the correctness of the perception process can be logically verified.

26. Philosophical afterthoughts
5.3 Empiricism and idealism
Unfortunately, there is a philosopher in the audience who asks how the background theory Σ gains its meaning, since it is given by the designer.
The engineer first claims she is not really interested in that problem, only in designing working robots, but points out that a symbol like Cup will be causally implicated in the robot seeing a cup on the table and treating it like a cup. It fulfills the same role whether a human designer put it there, it was learned, or it came from evolution.
The philosopher does not buy this and is still concerned about the origins of Σ.
The engineer explains that in this case Σ was indeed hand-coded by legions of hard-working PhD students, and that this will have to suffice for the current generation of robots (2005), but there is other research in Inductive Logic Programming showing that new predicates and symbols can be generated on the fly.
The philosopher says this is fine, but again there will surely always be some sort of hard-coding to kickstart the process. The engineer agrees, and says the theory must always provide sorts for time, objects, and spatial locations, and must also allow the representation of events, cause and effect, and spatial relations.

27. Concluding remarks
One way to think of this theory of perception is to contrast it with pure sensation. In visual sensation, given a description of the scene, a picture is generated for the viewer. In visual perception, the problem is inverted: given a picture, generate a description of the scene. In both cases, uncertainty is treated by accounting for noise in the mathematical model.
Unlike most AI vision work, this work assumes the final product of perception is a qualitative, symbolic description, rather than a quantitative, measured description (which is what we get from the pre-processing algorithm). The bridge is made through abductive inference, which operates using high-level knowledge.
This theory accounts for incompleteness and uncertainty, sensor fusion, top-down information flow, and active perception, all of which seem necessary for a representational theory of mind.

28. TAGIT Discussion
The author argued that logic can be used to achieve perception, and provided experimental evidence. NARS implements the principles stressed in this paper: hypothesis generation, anticipation, and logical inference. NARS also has the benefit of being able to handle uncertainty using NAL, and of generating its own high-level knowledge Σ through experience (composition and temporal association) rather than having us hard-code it.
Like the 3 robotics experiments in this paper, we should also run more robotics experiments in NARS. Think about: what kind of experiments can / should we run?
However, aspects of these experiments were idealized to a high degree (e.g., Sobel edge detection with thresholding rather than full color vision). We should of course reduce the problem as much as possible by designing very simple experiments, without oversimplifying it (i.e., the results should be extendable to more complex experiments).

29. Image Source URLs
Inkblot: https://brainfall.com/quizzes/inkblot-quiz-are-you-normal/
Expectation sign: https://news.miami.edu/as/stories/2020/02/heller-expectation-research.html
Sensor fusion: https://en.wikipedia.org/wiki/Stimulus_modality#/media/File:Multisensory.jpg
Shakey the Robot: https://en.wikipedia.org/wiki/Shakey_the_robot#/media/File:SRI_Shakey_with_callouts.jpg