3.2 A Unified Abstraction

The seamless integration of relational algebra and linear algebra has been listed as one of the current open research problems, with prior work highlighting the need for a holistic framework that supports both the relational operations required for the feature engineering phase and the linear algebra needed for the learning algorithms themselves. AIDA accomplishes this via a unified abstraction of data called TabularData, which provides both relational and linear algebra support for data sets.

TabularData. TabularData objects reside in AIDA, and therefore in the RDBMS address space. They remain in memory beyond individual remote method invocations. TabularData objects can work with both data stored in database tables and host language objects such as NumPy arrays. Users perform linear algebra and relational operations on a TabularData object using the client API, regardless of whether the actual data set is stored in the database or in NumPy. Behind the scenes, AIDA uses the underlying RDBMS’s SQL engine to execute relational operations and relies on NumPy to execute linear algebra. When required, AIDA transforms data between the two systems (see Figure 2) without user involvement and, as we will see later, can often do so without copying the actual data.
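The division of labor described above (a SQL engine for relational operations, NumPy for linear algebra, and one object that hides the hand-off) can be illustrated with a minimal sketch. This is not AIDA's implementation or API: the class name `ToyTabular`, its methods, and the `points` table are hypothetical, SQLite stands in for the RDBMS, and the sketch copies rows into NumPy arrays, whereas AIDA can often avoid such copies.

```python
import sqlite3
import numpy as np


class ToyTabular:
    """Hypothetical stand-in for a unified tabular abstraction (not AIDA's API)."""

    def __init__(self, columns):
        # columns: dict mapping column name -> 1-D NumPy array of equal length
        self.columns = {name: np.asarray(col, dtype=float)
                        for name, col in columns.items()}

    @classmethod
    def from_query(cls, conn, sql, params=()):
        # Relational work (selection, projection) is delegated to the SQL engine.
        cur = conn.execute(sql, params)
        names = [d[0] for d in cur.description]
        rows = cur.fetchall()
        # This toy version copies the result set into one array per column.
        return cls({n: [r[i] for r in rows] for i, n in enumerate(names)})

    def matrix(self):
        # Materialize the columns as a 2-D matrix for linear algebra.
        return np.column_stack(list(self.columns.values()))

    def __matmul__(self, other):
        # Linear algebra is delegated to NumPy.
        rhs = other.matrix() if isinstance(other, ToyTabular) else np.asarray(other)
        return self.matrix() @ rhs


# SQLite plays the role of the RDBMS in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x REAL, y REAL, label INTEGER)")
conn.executemany("INSERT INTO points VALUES (?, ?, ?)",
                 [(1.0, 2.0, 0), (3.0, 4.0, 1), (5.0, 6.0, 1)])

# Relational step: filter rows and project two columns in SQL.
features = ToyTabular.from_query(
    conn, "SELECT x, y FROM points WHERE label = ?", (1,))

# Linear algebra step: matrix-vector product in NumPy.
weights = np.array([0.5, -1.0])
scores = features @ weights
```

The caller works with a single object for both steps and never handles the conversion from result set to matrix explicitly, which is the property the TabularData abstraction is meant to provide.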
Figure 2: TabularData Abstraction (diagram labels: RDBMS, Embedded Python Interpreter, NumPy, TabularData, Linear Algebra Operators (+, *, @, …), Relational Operators, Table UDF, DB Table / Resultset, NumPy Array, Materialize, Matrix, column data, virtual columns)

Linear algebra and relational operations. AIDA builds on the familiarity of contemporary popular systems for its client API. For linear algebra, it simply emulates the syntax and semantics of the statistical package it uses: NumPy. For relational operators, we decided not to use pure SQL, as doing so would make it difficult to provide a seamlessly unified abstraction. Instead, we resort to object-relational mappings (ORMs), which allow an object-oriented view of, and method-based access to, the data in database tables. While not as sophisticated as SQL, ORMs are fairly versatile. ORMs have proven to be very useful for web developers, who are familiar with object-oriented programming but not with SQL. ORMs make it easy to query the database from a procedural language without having to write SQL or work with the nuances of JDBC/ODBC APIs. By borrowing syntax and semantics from ORMs (we mainly based our system on Django’s ORM module, a popular framework in Python), we believe that data scientists who are familiar with Python and NumPy, but less so with SQL, will be at ease writing database queries with AIDA.

3.3 Overview Example

Let’s have a look at two very simple code snippets that can be run in a client-based Python interpreter using AIDA’s client API. The first code snippet represents a relational operator: it accesses the supplier table of the TPC-H benchmark to calculate the number of suppliers and their total account balance.