Putting some magic into PostgreSQL (part 1/3)

Standard

PostgreSQL is a relational database management system created by the almighty Michael Stonebraker, the 2015-Turing-award-winner-creator-of-all-that-is-good-and-awesome. The same person that create HP Vertica. Long story short, PostgreSQL is a great data management tool.

PostgreSQL is a magical place where gnomes and elfs live in harmony and true happiness. These magical creatures are quite awesome and they know their place in the Universe. Moreover, the PostgreSQL environment is aimed to include new beings in his magic Zen forest. This is more evident when looking at the ton of PostgreSQL’s data types such as network address, figures (polygons) and geo-data. But what’s insane is its features to develop and use User Defined Functions (a.k.a. UDFs).

UDFs are basically programs that you can execute within your query. Nope! I’m not high, PostgreSQL let you import your programs into a query. You may probably used a UDF before if you worked with any DBMS that supports SQL. Does the functions avg, countmin and max sound familiar to you? Those are examples of UDFs. The DBMS is in charge of load and run your code, previously compiled as a shared library. You may be wondering How’s this even possible? worry not child. Here is a basic guide you can follow to implement C-UDFs in PostgreSQL.

  1. Write an example using the PostgreSQL UDF Library.
  2. Compiled your code into a shared object.
  3. Indicate to the DBMS where is the shared object
  4. Indicate to the DBMS what’s the function’s declaration (input and output)
  5. Name your UDF so you can call it in a SQL query.

The following was taken from [1]. First your code. Let’s write an example:

/* udf_test.c */
#include "postgres.h"
#include 
#include "fmgr.h"
 
PG_FUNCTION_INFO_V1(add_one);   
Datum add_one(PG_FUNCTION_ARGS)
{
    int32   arg = PG_GETARG_INT32(0);
 
    PG_RETURN_INT32(arg + 1);
}

The above example is a UDF that receives an integer and returns the value of that integer plus 1. Nothing fansy just a probe. Next lets see how to compile this example. To do this, we execute the next commands:

gcc -fpic -c udf_test.c
gcc -shared -o udf_test.so udf_test.o

The result of the compilation is the udf_test.so file, this is a shared library ready to be linked by our friend PostgreSQL. Now we connect to our database and put the next instructions.

CREATE FUNCTION add_one(INTEGER) RETURNS INTEGER
AS 'DIRECTORY/udf_test.so', 'add_one'
LANGUAGE C STRICT;

You have to substitute DIRECTORY by the complete directory where your shared library is located. The only thing left is use the UDF in a query. An example of a query is as follows:

SELECT add_one(1);

The result of the query should be 2. Once you complete the above, you’re ready to start puking rainbow from your mouth. Also, the sky is the limit. You can test your code, analyze data from different sources and most of all extend the SQL language. The SQL has numerous uses in a database context, one can edit the database structure, user’s permissions and query data. But, when it comes to analyze, to do some complex math (like matrix operations). Most DBMSs suffer a computational performance bottleneck while calculating matrix operations. The experiments in [2] demonstrate that the use of UDFs can improve the query performance.

Well that’s all for this time folks! In the next time will treat a more complex UDF and see what can we do with them, like the work made in [3] (:D). Until the next time! Have a nice day!

Daniel.

References

1. PostgreSQL Site, http://www.postgresql.org/docs/8.0/static/xfunc-c.html. Visited in September 11th 2015.

2. Zhibo Chen and Carlos Ordonez. 2008. Efficient OLAP with UDFs. In Proceedings of the ACM 11th international workshop on Data warehousing and OLAP (DOLAP ’08). ACM, New York, NY, USA, 41-48. DOI=10.1145/1458432.1458440 http://doi.acm.org/10.1145/1458432.1458440

3. Daniel E. Lopez-Barron, Ines F. Vega-Lopez, and Oswaldo Cuen-Tellez. Supporting Query By Content on ECG Data with User Defined Functions. Proceedings of the IASTED International Conference Advances in Computer Science. Pages 392-399. Phuket, Thailand. April 2013. ISBN: 978-0-88986-946-2.

4. Oswaldo Cuen-Tellez and Ines F. Vega-Lopez. Towards a Formal Model for ECG Data Analysis and Decision Making. Proceedings of the Fourth International Workshop on Knowledge Discovery, Knowledge Management, and Decision Making. Pages 225-231. Mazatlan, Mexico. November, 2013. ISBN: 978-94-91216-98-5.

Leave a Reply

Your email address will not be published. Required fields are marked *