How to Split Large .TSV Files and Import Them into MS Excel or OpenOffice Using EmEditor


In this article you will learn how to split large .tsv files (and other large text files) so you can import them into MS Office, OpenOffice or whatever else you need, using a small and neat (paid) program called EmEditor.

Here is the scenario: I have been a booking.com affiliate partner for many years. A few years ago I wished I could programmatically access all of their hotels (with metadata), so I could import that data into some other web app and reuse it however I wanted.

Recently I noticed a new option in my Booking Central: Data download. Here it was, everything I had been wishing for a couple of years earlier. Booking.com offered their hotel data as a zipped .tsv file. Unzipped, my file was about 1GB. I tried to import it into both Open Office (Libre Office) and MS Excel, but both programs stopped responding and couldn't handle the file. For a moment I felt desperate: I had the data, but I couldn't do much with it.

Booking.com hotel data sets

Then I remembered that a few years ago I was building a geospatial web application using data from geonames.org, and back then I used some tools to split large files into smaller ones so I could open them in Excel, adjust them the way I wanted and then import the data into Drupal using Feeds import.

Unfortunately I couldn't remember exactly which tools I had used, so I started searching on Google and quickly found a tool called EmEditor.


EmEditor is a fast, lightweight, yet extensible, easy-to-use text editor for Windows. Both native 64-bit and 32-bit builds are available!
 

EmEditor is not free (it costs about $40), but it does offer a fully functional free trial for 30 days. So I downloaded EmEditor, opened my almost 1GB .tsv file in it, and got stuck for a moment: I couldn't find the function to split the file. After a couple of minutes of searching I finally found it under Tools -> Split/Combine.

EmEditor File Splitter

The Split Current Document to Several Files command allows you to split the current document into several files either every user-specified number of lines, or before every bookmarked line. It also allows you to specify a header and/or footer to each separated file. 
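If you are curious what an "every N lines, with a header" split amounts to, here is a minimal Python sketch of the same idea. This is not how EmEditor does it, just an illustration: the function name `split_with_header` and the `_partNNN` file naming are my own inventions, and the sketch assumes the first line of the file is the header and that no field contains an embedded line break.

```python
import os


def split_with_header(src_path, out_dir, lines_per_file, encoding="utf-8"):
    """Split a text file into parts of at most `lines_per_file` data lines.

    The first line of the source is treated as a header and is repeated
    at the top of every part. Returns the list of part file paths.
    (Hypothetical helper, not part of EmEditor or any library.)
    """
    if lines_per_file < 1:
        raise ValueError("lines_per_file must be at least 1")

    os.makedirs(out_dir, exist_ok=True)
    base, ext = os.path.splitext(os.path.basename(src_path))
    part_paths = []
    out = None

    try:
        # newline="" keeps the original line endings untouched.
        with open(src_path, "r", encoding=encoding, newline="") as src:
            header = src.readline()
            # Read line by line so the whole file never sits in memory.
            for index, line in enumerate(src):
                if index % lines_per_file == 0:
                    # Time to start a new part: close the previous one first.
                    if out is not None:
                        out.close()
                    name = f"{base}_part{len(part_paths) + 1:03d}{ext}"
                    path = os.path.join(out_dir, name)
                    out = open(path, "w", encoding=encoding, newline="")
                    out.write(header)
                    part_paths.append(path)
                out.write(line)
    finally:
        if out is not None:
            out.close()

    return part_paths
```

Calling something like `split_with_header("hotels.tsv", "parts", 1_000_000)` (the file name is a placeholder) would keep each part just under Excel's limit of 1,048,576 rows per worksheet, header included.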

It took very little time (less than 2 minutes) to split my 1GB file into six smaller files. Once the split was done, I opened the resulting files in MS Excel, set up filters and was ready to continue developing my application for importing the data into Drupal.

Try EmEditor.

Hope this helps!
