Understanding and Developing Data Strategy and Monetization
A data strategy is a plan that outlines how an organization will collect, store, manage, and use its data. It is important because it can help organizations to improve decision-making,…
Covering Data Science, Business Intelligence, Technology Industry News Updates and Trends
Data lakes are the foundation for building the AI-driven solutions that many organizations need. There is much more information in audio, video, images, and social media chatter that can drive…
Quick reference for essential Git commands.
This article focuses on developing understanding and hands-on skills for carrying out exploratory data analysis over any given dataset.
Introduction: To be successful as a data analyst, you need to learn fundamental data analysis techniques and data-oriented programming languages, and have a strong background in math. Here are the most…
Are you looking for a tool to visualize, explore, and analyze data that can be modified and adapted to your organizational architecture and does not cost a fortune? Apache Superset is…
In this code example, we will apply a convolutional neural network (CNN) to textual data. More specifically, we will use the structure of CNNs to classify text. Unlike images, which are…
In this article, we discuss key concepts of Hadoop and answer various common questions related to Hadoop.
This tutorial provides hands-on coding examples covering database programming for Greenplum/PostgreSQL using the SQLAlchemy library.
This tutorial shares basic database programming code for PostgreSQL and Greenplum using Python and the psycopg2 library.
Not many data scientists realize that it is possible and much more beneficial to build data science models directly on top of a data warehouse. Apache MADlib is an open-source…
Q1. What is the minimum number of variables/features required to perform clustering? A. 0 B. 1 C. 2 D. 3 Solution: (B) At least a single variable is required…
Python multiple-choice questions for learning and self-improvement.
Building great data science models is key to success in business. Understanding the difference between data science and business intelligence is key to using these to succeed in your business…
Spark is a fast, easy-to-use and flexible data processing framework. It has an advanced execution engine supporting cyclic data flow and in-memory computing. Spark can run on Hadoop, standalone or…
Deep learning is a field with intense computational requirements, and your choice of GPU will fundamentally determine your deep learning experience. As discussed previously, a CPU is divided into multiple…
Greenplum Architecture: The main reason behind the adoption of massively parallel processing (MPP) data warehouse (DWH) solutions is the MPP architectural principles. These principles aim to remove the main drawbacks of traditional DWHs and make…
Exploratory data analysis is an important step in any data science or business intelligence project because it gives you an adequate understanding of the underlying data.
Database programming in Python using the SQLAlchemy library, which provides a uniform interface for connecting to different databases.
This article gives an overview of exploratory data analysis techniques, followed by code examples using the seaborn library.