Written by: Nayomi Chibana

August 19, 2015

Data Visualizations: A Beginner’s Guide to Finding Stories In Numbers

Have you been looking for a way to stand out in a crowded Web with completely original content–and hit a brick wall?

Whether you’re a content marketer, blogger, editor or entrepreneur, you’re probably well-acquainted with the importance of creating content that is both visually striking and full of useful, shareable information that is not easily found elsewhere.

The trouble is that this is becoming harder by the second.

With all the clamoring for attention on the Web, the process of creating a piece of content that cuts through all the noise can be extremely challenging–if not seemingly impossible at times.

The truth is that in this age of information overload, we don’t need to learn how to find new data, but how to process and present what’s already out there in innovative ways. This practice is called data journalism, and it simply refers to the process of finding stories in large quantities of data.

How do you create compelling content from data?

While this may sound like something that requires years of training–especially to create all those impressive infographics and data visualizations–it’s actually such a new development within journalism that journalists themselves are learning as they go.

In fact, with the movement of citizen journalism, data journalism skills can now be acquired by any person to keep tabs on their government or disseminate their own news. This is all the more reason for those eager to differentiate themselves from the competition to acquire these skills.

To guide you on your path to developing high-quality content that is a cut above the rest, we’ve outlined some key steps to follow in the creation of original, data-driven content, as well as a list of resources and sites you can resort to in the process.


1 Get Comfortable With Spreadsheets

Although you should have a basic understanding of percentages, averages, ratios and rates to begin with, you don’t have to be a math genius to create your first piece of compelling data-driven content.
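If you’d like to see how simple this math really is, here is a minimal sketch in Python. The function names and the figures are made up purely for illustration; they don’t come from any real data set.

```python
def percent_change(old, new):
    """Percentage change from an old value to a new one."""
    return (new - old) * 100 / old


def rate_per(count, population, per=100_000):
    """Events per `per` people, so places of different sizes can be compared."""
    return count * per / population


# Hypothetical figures, invented for this example.
print(percent_change(200, 250))   # 25.0 (a 25 percent increase)
print(rate_per(50, 200_000))      # 25.0 (25 events per 100,000 people)
```

Rates like the second one matter because raw counts almost always favor the larger city or country; dividing by population puts everyone on the same scale.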

You will need, however, to familiarize yourself with Excel and Google Spreadsheets. To get your feet wet, start with this free and easy-to-use Excel tutorial and focus specifically on the sections on pivot tables and on sorting, aggregating and filtering data.
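For readers who are curious about what sorting, filtering and aggregating actually do under the hood, the sketch below performs the same three operations in plain Python. The rows are hypothetical and stand in for a small spreadsheet; a spreadsheet does the same thing with a few clicks.

```python
from collections import defaultdict

# Hypothetical rows, as they might appear in a spreadsheet.
rows = [
    {"state": "Ohio", "year": 2013, "cases": 120},
    {"state": "Ohio", "year": 2014, "cases": 90},
    {"state": "Texas", "year": 2013, "cases": 300},
    {"state": "Texas", "year": 2014, "cases": 340},
]

# Sorting: largest number of cases first.
by_cases = sorted(rows, key=lambda r: r["cases"], reverse=True)

# Filtering: keep only the 2014 rows.
only_2014 = [r for r in rows if r["year"] == 2014]

# Aggregating: total cases per state, much like a simple pivot table.
totals = defaultdict(int)
for r in rows:
    totals[r["state"]] += r["cases"]

print(by_cases[0])
print(only_2014)
print(dict(totals))
```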


2 Find and Access Data

One of the most challenging parts of creating a data-driven piece is finding usable information. To start, consult these three different sources of data:

Open Databases

The push for transparency and easier public access to information is a worldwide trend that is here to stay. The digital era has empowered the masses through the release of once-confidential information, and journalists and non-journalists alike should make use of the abundant information that is already on the Web.

To find it, however, you have to know where and how to look.

Google’s advanced search, for example, allows you to narrow your results by specifying a domain extension (such as .gov or .edu) and a file format (such as XLS or CSV). You can also search for part of the URL: googling “inurl:downloads filetype:xls,” for example, returns Excel files that have the word “downloads” in their Web address.
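If you run these searches often, a tiny helper can assemble the query for you. The following is only a sketch: `build_query` is a hypothetical function name, and it simply strings together the `site:`, `inurl:` and `filetype:` operators described above for you to paste into the search box.

```python
def build_query(keywords="", site=None, inurl=None, filetype=None):
    """Assemble a search string from keywords and optional search operators."""
    parts = [keywords] if keywords else []
    if site:
        parts.append(f"site:{site}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


# Spreadsheets on government sites that mention unemployment.
print(build_query("unemployment", site=".gov", filetype="csv"))

# The example from the text above.
print(build_query(inurl="downloads", filetype="xls"))
```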

You can also conduct specialized searches by using Google Scholar or Wolfram|Alpha, which allows you to search for information based on computational data rather than keywords.

Since countries all over the world are emulating the transparency initiatives of the American and British governments, which have their own data portals (data.gov, fedstats.sites.usa.gov and data.gov.uk), there is also a lot of information comparable across nations that can be reused by businesses and citizens.

UK Government open data portal.

A comprehensive list of open data portals from all over the world can be found at dataportals.org. If you want to make international comparisons, you can also visit Gapminder and the data portals of the World Bank and the United Nations.

For national demographic data, visit census.gov, and if you want to find information at a local or state level, check out the American FactFinder, Quick Facts and Censtats.

“Evidence suggests that data journalism is the journalism of the future” – Sandra Crucianelli

For very specific queries, you can consult the Web page of the pertinent authority. For example, for immigration statistics, you can consult dhs.gov; for a college navigator, go to nces.ed.gov; for labor statistics, there’s bls.gov; for environmental data, visit epa.gov.

There are also sites that collate data from a variety of organizations, such as Google Public Data Explorer, Data Hub and Freebase. You can also find databases that aggregate research data, such as the UK Data Archive and clinicaltrials.gov.

But what if the information you’re looking for isn’t on the Web? There are two other options you can try:

Gather Your Own Information

You can compile your own data by conducting informal surveys using Google Forms (see tutorial here) or using information from your organization’s own internal database.

While these may be more time-consuming alternatives, they will be well worth your time if they can give you unique insight into your industry.

Freedom of Information Request

A third option is to file a Freedom of Information request, which can be used to access records from any U.S. federal agency. The drawback is that these types of requests may take several weeks to process, depending on the agency and the complexity of the request presented.

The key to an expedited response, however, is to demonstrate that you know your rights–citing if necessary the Freedom of Information Act–and to present a request that is as specific and detailed as possible, using technical jargon if required, to save time.



3 Convert Your Data

Once you have the data you need to answer your initial question or to support the point you want to make, you can now convert the data into a format you can work with.

Since you need to import data into an Excel or Google spreadsheet, you’ll want to download data in CSV (comma-separated values) format whenever possible. There are times, however, when you’ll find a chart or graph in a PDF file, in which case you’ll need to get this data into Excel using a converter, such as Zamzar, Import.io, Tabula or ScraperWiki. Other times, you may also find useful information as an image, in which case you can use optical character recognition (OCR) software, such as Free OCR.
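To show why CSV is such a convenient format, here is a minimal sketch that reads CSV text with Python’s built-in `csv` module. The data and the `load_rows` helper are hypothetical; the point is that each row becomes a simple record you can sort or total, and that numbers saved as text (such as "30,000") usually need a small conversion first.

```python
import csv
import io

# Hypothetical CSV content, as it might be downloaded from a data portal.
csv_text = """city,population
Springfield,"30,000"
Shelbyville,12500
"""


def load_rows(text):
    """Parse CSV text into a list of dicts with a numeric population."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        # Strip thousands separators so the value can be treated as a number.
        row["population"] = int(row["population"].replace(",", ""))
        rows.append(row)
    return rows


rows = load_rows(csv_text)
for row in rows:
    print(row["city"], row["population"])
```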



4 Clean Your Data

Now that you have your information in Excel format, you need to clean it up to eliminate inconsistencies and information you don’t need.

For small tasks, information can be cleaned up directly in the spreadsheet, but for larger tasks, a very useful tool is OpenRefine (formerly Google Refine). Here, you can make data consistent, delete duplicate information (make sure you always work with a copy of the original data file) and even merge data sets. More sophisticated uses for this tool can also be found in this helpful tutorial.

Use OpenRefine to clean your data.
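The two most common cleaning chores, making values consistent and removing duplicates, can be sketched in a few lines of Python. This is not how OpenRefine works internally; it is only an illustration with invented records, and the `clean` function is a hypothetical name. Notice that it builds a new list and leaves the original untouched, in keeping with the advice to always work on a copy.

```python
def clean(records):
    """Normalize names and drop duplicates, leaving the input list untouched."""
    seen = set()
    cleaned = []
    for name, value in records:
        # Collapse stray whitespace and standardize capitalization.
        key = " ".join(name.split()).title()
        if (key, value) not in seen:
            seen.add((key, value))
            cleaned.append((key, value))
    return cleaned


# Hypothetical messy records: the first two are really the same entry.
raw = [("new york", 10), ("New  York ", 10), ("BOSTON", 7)]
print(clean(raw))   # [('New York', 10), ('Boston', 7)]
```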

5 Process and Analyze Data

Once your data has been edited and reformatted to suit your purposes, you can start processing the information using the spreadsheet skills mentioned above, such as sorting, filtering and aggregating. For example, you can sort data in ascending or descending order in terms of size or by location; you can calculate and compare means; you can also compare two data sets.
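As a rough illustration of comparing two data sets, the sketch below calculates the mean of each and the difference between them using Python’s built-in `statistics` module. The numbers are invented for the example.

```python
from statistics import mean

# Hypothetical monthly figures before and after some change.
before = [12, 15, 11, 14]
after = [18, 17, 21, 16]

mean_before = mean(before)   # (12 + 15 + 11 + 14) / 4 = 13
mean_after = mean(after)     # (18 + 17 + 21 + 16) / 4 = 18
difference = mean_after - mean_before

print(mean_before, mean_after, difference)

# Sorting in descending order shows the largest values first.
print(sorted(after, reverse=True))
```

A difference in means is a starting point for questions, not a conclusion in itself; it tells you where to look, not why the numbers changed.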

The idea is to analyze your data to find the story and “interview” the information, just as you would any other source. By asking many questions, you will obtain various interpretations of the same data instead of simply sticking with your first reading.

To obtain more robust conclusions, you can also conduct simple statistical and graphical analyses using free software such as R (available from the R Project) and RStudio.

Use the application RStudio to conduct statistical analyses of your data.

6 Present Your Findings

Now, here comes the fun part: using data visualizations and infographics to present your findings in a visually appealing and easy-to-understand format. Depending on the type of story you’re working on, you could use a range of visualization tools, from maps and diagrams to interactive charts and network graphs.
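Before reaching for a dedicated tool, it can help to see the basic idea behind any chart: turning numbers into proportional shapes. The sketch below draws a plain-text bar chart in Python. The `text_bar_chart` function and its data are hypothetical, and the result is a quick sanity check rather than a finished graphic.

```python
def text_bar_chart(data, width=20):
    """Return a plain-text bar chart with one line per label."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # Scale each bar relative to the largest value.
        bar = "#" * round(value / largest * width)
        lines.append(f"{label.ljust(label_width)} {bar} {value}")
    return "\n".join(lines)


# Hypothetical totals, invented for this example.
print(text_bar_chart({"Ohio": 210, "Texas": 640}))
```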

For local or national stories on geographic trends, you can use interactive mapping tools such as Google Fusion Tables (which allows you to make maps with several layers), Indiemapper, GeoCommons and ChartsBin (for clickable world maps).

Use Tableau Public to create data visualizations.

For studying and visualizing connections within any type of network, there’s also software that is free or offers a free trial, such as NodeXL, Gephi and UCINET. Some popular analytics tools that enable you to create a variety of data visualizations are Tableau Public and Many Eyes. When it comes to infographics, Visme offers a wide variety of templates to present your data in a colorful and attractive manner.

Once you’re ready to present your cool-looking infographic or data visualization, you’ll find that all the hard work you put into creating this completely original piece of content was worth the effort. Not only will your audience appreciate it, they will gladly share it with the rest of the world for you.

Did you find this post interesting or useful? We would love to hear your opinions, suggestions and storytelling experiences in the comments section below.

About the Author

Nayomi Chibana is a journalist and writer for Visme’s Visual Learning Center. She has an M.A. in Journalism and Media from the University of Hamburg in Germany and was an editor of a leading Latin American political investigative magazine for several years. She has a passion for researching trends in interactive longform media.

8 responses to “Data Visualizations: A Beginner’s Guide to Finding Stories In Numbers”

  1. samar says:

    This is an incredibly comprehensive collection of tools for data science some of which I was not aware of. However, I note that in the RStudio illustration you show an Ubuntu Desktop with Lyx as one of the applications. I personally find that using Lyx with the Rnw (Knitr) module provides a great environment for both writing R code and documenting it. I also tend to use Rattle under R for exploratory analysis. Code can be extracted from R Studio and Rattle and inserted in Lyx to enable one to provide a high quality PDF output in a single document. Lyx implements the code when you initiate the output, and you have the option to show the code or just the code output.

  3. Soumya Roy says:

    Nayomi Chibana, thanks for writing such a brilliant and useful content. It is really helpful for us because we used to do the same but found some important ideas from this post which will make our process more effective and trustworthy. You are right, numbers are always been interesting to human being, but this is our job to present raw data in a more attractive way. Again thanks for sharing this post and actionable tips.

    • Soumya, thank you for taking the time to comment. I’m glad you found it useful. We will be publishing another post shortly on how to create interactive infographics with original and new data with step-by-step instructions, so please stay tuned.

