Serverless Analytics on AWS

Online analytical processing (OLAP) uses a multidimensional approach to organize and analyze business data. By storing data in highly optimized structures, businesses can explore it quickly and uncover insights that would otherwise remain hidden. As a result, OLAP supports key organizational goals, including wide-ranging business intelligence and analytics.

Running OLAP-style analytics on AWS with serverless services such as AWS Glue and Amazon Athena takes only a few steps and helps you find anomalies and trends in your dataset.

Reading data from Amazon Aurora and transforming it into Parquet format for Athena to query involves the following steps:

1. Create the RDS connection: In AWS Glue, add a connection and enter the database details, such as the database name, username, and password.
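The same connection details can also be supplied programmatically. The sketch below assumes boto3 and an Aurora MySQL endpoint, and builds the `ConnectionInput` for the Glue `create_connection` call. The connection name, host, database name, and credentials are hypothetical placeholders, not values from this walkthrough.

```python
def build_connection_input(name, host, port, database, username, password):
    """Build the ConnectionInput dict for a Glue JDBC connection."""
    return {
        "Name": name,
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            # Assumes Aurora MySQL; Aurora PostgreSQL would use jdbc:postgresql://
            "JDBC_CONNECTION_URL": f"jdbc:mysql://{host}:{port}/{database}",
            "USERNAME": username,
            "PASSWORD": password,
        },
    }


def create_connection(connection_input):
    """Register the connection in Glue (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    glue = boto3.client("glue")
    glue.create_connection(ConnectionInput=connection_input)


# Hypothetical values for illustration only.
connection_input = build_connection_input(
    name="aurora-connection",
    host="my-cluster.example.us-east-1.rds.amazonaws.com",
    port=3306,
    database="sales",
    username="glue_user",
    password="change-me",
)
```

Calling `create_connection(connection_input)` would register the connection. A database inside a VPC usually also needs subnet and security group settings on the connection, which this sketch leaves out.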

2. Set up the crawler: Once the RDS connection is established, click the Crawlers tab, add a crawler, and enter the following details:

Select JDBC as the data store and select the RDS connection created in step 1 to connect to the Aurora database. Specify the include path up to the database name; the crawler then creates Data Catalog entries, with their metadata, for all tables in that database.
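As a rough equivalent of the console steps, the crawler can be defined with boto3's `create_crawler`. This is a minimal sketch: the crawler name, IAM role ARN, catalog database, connection name, and source database are all assumed placeholders. The `sales/%` include path asks the crawler to catalog every table in the `sales` database.

```python
def build_jdbc_crawler(name, role_arn, catalog_database, connection_name, source_database):
    """Build keyword arguments for glue.create_crawler with a JDBC target."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": catalog_database,  # Data Catalog database that receives the tables
        "Targets": {
            "JdbcTargets": [
                {
                    "ConnectionName": connection_name,
                    # Path up to the database name; % matches all tables in it
                    "Path": f"{source_database}/%",
                }
            ]
        },
    }


def create_and_start_crawler(params):
    """Create the crawler and run it once (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    glue = boto3.client("glue")
    glue.create_crawler(**params)
    glue.start_crawler(Name=params["Name"])


# Hypothetical values for illustration only.
crawler_params = build_jdbc_crawler(
    name="aurora-crawler",
    role_arn="arn:aws:iam::123456789012:role/GlueCrawlerRole",
    catalog_database="aurora_catalog",
    connection_name="aurora-connection",
    source_database="sales",
)
```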

3. Create the Glue job: Click the Jobs tab, create a Glue job, and enter the following details:

Specify the IAM role, the script name, and the S3 path where the job script should be stored. Under catalog options, make sure you select the Data Catalog, because the job uses the catalog created by the crawler in step 2 to fetch data from Aurora.
Once these steps are complete, the Glue job editor opens and you can start writing the script.
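A script for the job editor might look like the sketch below. It is not the original job script: it reads one cataloged table into a DynamicFrame and writes it to S3 as Parquet, partitioned by one column. The catalog database, table name, bucket path, and partition column are assumptions for illustration. The `awsglue` and `pyspark` modules exist only inside the Glue job runtime, so they are imported inside `run_job`.

```python
import sys


def sink_options(output_path, partition_keys):
    """Connection options for writing partitioned output to S3."""
    return {"path": output_path, "partitionKeys": list(partition_keys)}


def run_job(database, table_name, output_path, partition_keys):
    """Read a cataloged table and write it to S3 as partitioned Parquet."""
    # These modules are provided by the Glue job runtime.
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Table metadata comes from the Data Catalog populated by the crawler.
    source = glue_context.create_dynamic_frame.from_catalog(
        database=database, table_name=table_name
    )

    glue_context.write_dynamic_frame.from_options(
        frame=source,
        connection_type="s3",
        connection_options=sink_options(output_path, partition_keys),
        format="parquet",
    )
    job.commit()


# Hypothetical values; inside a Glue job you would end the script with:
# run_job("aurora_catalog", "sales_orders", "s3://my-bucket/orders/", ["order_date"])
options = sink_options("s3://my-bucket/orders/", ["order_date"])
```

With `partitionKeys` set, Glue writes one folder per distinct value of the column, for example `order_date=2021-01-01/`.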

DynamicFrame vs DataFrame:

A DynamicFrame is similar to a DataFrame, except that each record is self-describing, so no schema is required initially. Use a DynamicFrame when the data does not conform to a fixed schema.
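To make "self-describing" concrete, here is a plain-Python analogy. It does not use the Glue API. It treats each record as a dictionary that carries its own field names and types, and derives a schema only when asked, recording every type seen for a field. The sample records are invented for illustration.

```python
from collections import defaultdict


def infer_schema(records):
    """Return {field: sorted list of type names seen across all records}."""
    seen = defaultdict(set)
    for record in records:
        for field, value in record.items():
            seen[field].add(type(value).__name__)
    return {field: sorted(types) for field, types in seen.items()}


# Invented records: "price" is an int in some rows and a string in another,
# and "coupon" appears in only one row.
records = [
    {"id": 1, "price": 10},
    {"id": 2, "price": "12.50"},
    {"id": 3, "price": 8, "coupon": "SPRING"},
]

schema = infer_schema(records)
print(schema)
```

Here `price` ends up with two candidate types. A DynamicFrame can hold such a field as a choice type until you settle it, for example with `resolveChoice`, whereas a DataFrame needs a single type per column up front. Inside a Glue job, `toDF()` and `DynamicFrame.fromDF()` convert between the two.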

4. Set up Athena: The Glue job above writes the data to S3 in Parquet format, partitioned by a specific column.
Create a new crawler and specify the S3 path up to the partitioned folder level so that it creates the metadata Athena needs to query the data.
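Once the second crawler has cataloged the Parquet files, a query can be submitted from code as well as from the Athena console. The sketch below assumes boto3 and builds the arguments for Athena's `start_query_execution`. The database, table, partition column, and results bucket are hypothetical. Filtering on the partition column lets Athena skip the other partition folders.

```python
def build_athena_query(database, table, partition_column, partition_value,
                       output_location, limit=10):
    """Build keyword arguments for athena.start_query_execution."""
    # Values are interpolated directly for brevity; validate them in real code.
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {partition_column} = '{partition_value}' "
        f"LIMIT {limit}"
    )
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def start_query(params):
    """Submit the query and return its execution ID (requires boto3)."""
    import boto3  # imported lazily so the builder above works without the SDK

    athena = boto3.client("athena")
    return athena.start_query_execution(**params)["QueryExecutionId"]


# Hypothetical values for illustration only.
query_params = build_athena_query(
    database="analytics_catalog",
    table="orders",
    partition_column="order_date",
    partition_value="2021-01-01",
    output_location="s3://my-bucket/athena-results/",
)
```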
