Mayur Naik @AI4Code tweet - I am excited to share a preview of Scallop: a new programming language and toolchain for neurosymbolic AI. Website: https://t.co/M5Tx1Otaua (1/18)

Mayur Naik

@AI4Code

4 years

Scallop is a declarative language designed to support rich symbolic reasoning in AI applications. It is based on Datalog, a logic rule-based query language for relational databases. (2/18)

Scallop includes a scalable solver equipped with support for discrete, probabilistic, and differentiable modes of reasoning. These modes are configurable to suit the needs of different AI applications. (3/18)

Scallop also provides bindings to support logic reasoning modules within Python programs. Scallop can be deeply integrated with existing PyTorch machine learning pipelines. (4/18)

Why design a new language, and why now? The complementary benefits of deep learning and symbolic reasoning have spurred a small but growing body of researchers to explore a hybrid approach, called neurosymbolic AI. (5/18)

Many recent works have demonstrated that neurosymbolic methods outperform end-to-end neural approaches on a variety of AI tasks. @GaryMarcus’s article (https://t.co/PQbYGbQYgi) provides references and commentary. (6/18)

The overarching goal of Scallop is to help shape the future of AI in aspects of data efficiency, safety, explainability, and equity. This will require researchers working on LLMs, vision, NLP, and formal methods to work together. Scallop aims to facilitate such collaboration. (7/18)

To illustrate Scallop, consider the image classification task Pathfinder/Path-X from Google Research’s Long Range Arena benchmark for efficient transformers (https://t.co/TcMa3t8VwJ). In this task, a model must determine whether the two dots are connected by dashed lines. (8/18)

Modern transformers achieve 77% accuracy on Pathfinder (32x32 pixel images, or 1024-token sequences) and cannot improve on the 50% random-guess baseline on Path-X (128x128 pixel images, or 16K-token sequences). (9/18)

This task can be programmed in just a few lines in Scallop using a single CNN for detecting dots and dashes, together with three logic rules below, to achieve 90.6% accuracy on Pathfinder, and 89.5% accuracy on Path-X. (10/18)
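
The rules referenced in this tweet are not reproduced in the text of the thread. As a rough illustration only, here is a minimal Python sketch of what three reachability-style rules could look like, assuming the CNN's output has already been turned into discrete facts. The names `dots`, `dashes`, `path`, and `is_connected` are hypothetical, and this is plain Python, not Scallop syntax or the authors' actual rules.

```python
def is_connected(dots, dashes):
    """Return True if two distinct dots are linked by a chain of dashes.

    dots:   set of cell ids where a dot was detected (hypothetical input)
    dashes: set of (a, b) pairs of cells joined by a dash (hypothetical input)
    """
    # Treat dashes as undirected edges.
    edges = dashes | {(b, a) for (a, b) in dashes}

    # Rule 1: path(x, y) :- dash(x, y).
    path = set(edges)

    # Rule 2: path(x, z) :- path(x, y), dash(y, z).
    # Re-apply until no new tuples appear (naive fixpoint iteration).
    while True:
        new = {(x, z) for (x, y) in path for (y2, z) in edges if y == y2} - path
        if not new:
            break
        path |= new

    # Rule 3: connected() :- dot(x), dot(y), x != y, path(x, y).
    return any((x, y) in path for x in dots for y in dots if x != y)


dots = {0, 3}
print(is_connected(dots, {(0, 1), (1, 2), (2, 3)}))  # True
print(is_connected(dots, {(0, 1), (2, 3)}))          # False
```

In this sketch the facts are hard true/false values; the thread's point is that Scallop lets such rules run over uncertain neural predictions instead.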

What are the key design choices in Scallop? 1. Relational Data Model 2. Rule-Based Reasoning 3. Loss Function Shaping 4. Data Provenance 5. Compositionality (11/18)

(1) Symbols in Scallop are relational tuples. Why? The relational data model has stood the test of time in data processing. Relations are general (they can represent graphs). Much of the world’s data is in relational DBs. And we can leverage optimizations from DB research to scale. (12/18)

(2) Reasoning in Scallop is specified via logic rules. Why? Rules can express rich reasoning patterns (KG, causality, counterfactuals). Rules are regularizable, composable, and interpretable. Rules can even be learnt from data using advances in program synthesis / ILP. (13/18)

(3) Loss Function Shaping. Scallop does not require neural+reasoning modules to be end-to-end differentiable, as @jcbaillie also advocates (https://t.co/9Nl68HJkfE). Scallop allows the loss function to be shaped by choosing from an extensible and convenient library of semirings. (14/18)
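
To make the semiring idea concrete, here is a minimal Python sketch, assuming a derived fact has a known list of proofs and each input fact carries a weight from a neural model. The `Semiring` class, the three instances, and `evaluate` are hypothetical names for illustration, not Scallop's actual library. Weights are plain floats here; in a differentiable setting they would be tensors.

```python
from dataclasses import dataclass
from functools import reduce
from operator import add, mul
from typing import Callable


@dataclass(frozen=True)
class Semiring:
    zero: float                             # identity for `plus`
    one: float                              # identity for `times`
    plus: Callable[[float, float], float]   # merges alternative proofs
    times: Callable[[float, float], float]  # merges facts inside one proof


MAX_MIN = Semiring(0.0, 1.0, max, min)   # weakest link of the best proof
MAX_MULT = Semiring(0.0, 1.0, max, mul)  # score of the single best proof
ADD_MULT = Semiring(0.0, 1.0, add, mul)  # sum over all proofs (may exceed 1)


def evaluate(proofs, weight, sr):
    """Combine fact weights: `times` within a proof, `plus` across proofs."""
    total = sr.zero
    for proof in proofs:
        value = reduce(sr.times, (weight[f] for f in proof), sr.one)
        total = sr.plus(total, value)
    return total


weight = {"a": 0.9, "b": 0.8, "c": 0.5}
proofs = [("a", "b"), ("c",)]  # the fact holds if (a and b) or c
for name, sr in [("max-min", MAX_MIN), ("max-mult", MAX_MULT), ("add-mult", ADD_MULT)]:
    print(name, evaluate(proofs, weight, sr))
```

The same rules and inputs yield a different score under each semiring, so the loss computed from that score changes too: the max-based variants send gradient only through the best proof, while add-mult spreads it over every proof. Note that the add-mult result is a weight rather than a probability unless the proofs are mutually exclusive.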

(4) Data Provenance. Scallop supports emerging use-cases like data-centric AI (https://t.co/cs5fUf0dJI), machine unlearning, and XAI. How? Via provenance information which can be tracked efficiently during evaluation of Datalog rules. (15/18)
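
As a rough illustration of tracking provenance during rule evaluation, the Python sketch below records, for each derived `path(x, y)` tuple, every set of input edge facts that proves it. The function name and the representation of a proof as a `frozenset` of edges are assumptions made for this example, not Scallop's internal design, and this naive version does not attempt to be efficient.

```python
def reachability_with_provenance(edges):
    """Map each derived path(x, y) to the set of proofs that support it.

    A proof is a frozenset of the input edge facts used in one derivation.
    """
    # Base rule: path(x, y) :- edge(x, y). Each edge proves itself.
    prov = {e: {frozenset([e])} for e in edges}

    changed = True
    while changed:
        changed = False
        # Recursive rule: path(x, z) :- path(x, y), edge(y, z).
        for (x, y), proofs in list(prov.items()):
            for (y2, z) in edges:
                if y != y2:
                    continue
                # Extend every known proof of path(x, y) with edge(y, z).
                extended = {p | {(y2, z)} for p in proofs}
                known = prov.setdefault((x, z), set())
                if not extended <= known:
                    known |= extended
                    changed = True
    return prov


edges = {("a", "b"), ("b", "c"), ("a", "c")}
prov = reachability_with_provenance(edges)
for proof in prov[("a", "c")]:
    print(sorted(proof))

# Would path(a, c) survive if the input fact edge(a, c) were removed?
removed = ("a", "c")
print(any(removed not in proof for proof in prov[("a", "c")]))  # True
```

The final check hints at how provenance could support explanation or unlearning: a derived fact still holds after deleting an input if at least one of its proofs does not use that input.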

(5) Compositionality. As argued by @raphaelmilliere (https://t.co/yTVhbiI8Ly), neural models struggle with compositionality, despite recent breakthroughs. Scallop can serve as glue to combine results from neural models in multi-modal applications, such as vision+NLP. (16/18)

What are we excited about next? Building real applications using Scallop that: 1. Cooperate with LLMs (e.g. Q&A with a coherent train of thought). 2. Have high safety requirements (e.g. healthcare/robotics). 3. Involve limited data and rich domain knowledge that LLMs lack. (17/18)

We plan to open-source Scallop. You can download binaries from https://t.co/M5Tx1Otaua. We will present a tutorial at https://t.co/GQ5UR3CMlq and make lectures available. If you are interested in using Scallop or collaborating, please get in touch! (18/18)

Barney Pell

@barneyp

4 years

@AI4Code Looks great!

Tim Cash

@timcash

4 years

@AI4Code Does Scallop have code completion written in Scallop? Heading over to the docs to try this out. Thank you for sharing. Semirings have no additive inverse - thanks, Wikipedia https://t.co/qSvwbLDo9q
