Tuesday, September 28, 2010

Preparing to analyze a huge Blogger blog with NVivo 8

(Finally got all the files imported. Please see the update post on how I got the entire blog, and more, imported into NVivo 8 SP4.)

This is the problem I am trying to tackle now... and have been since yesterday....

How do you analyze a blog with over 1500 postings with NVivo 8 SP4? (A price I have to pay for having this congenital disease called verbal diarrhea... 8-X lol)

After a whole lot of trial and error, this is what I have gotten done so far these past two days.


Gathering all documents on my hard drive:

I downloaded the entire Ratology Reloaded blog using HTTrack. Apparently, one can also use Adobe Acrobat Professional to convert an entire site into PDF files directly; yet, since I already had the entire site downloaded and I didn't want to convert all the external links into PDF files, I didn't use Acrobat to download the whole site.

Converting documents to Word/text files:

Before you can analyze anything with NVivo, you need to be able to get the info in. If there were fewer postings, say 100 or so, I would consider simply copying and pasting. Yet, since I have 1500-something postings, I have to find ways to batch process the documents into something NVivo can handle.

NVivo 8 doesn't support HTML files. Thus, you either have to find a program that converts HTML files directly into Word documents, or you have to convert the HTML files into PDFs and then convert the PDFs into Word documents.
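If plain text is an acceptable target (the heading above mentions "text files"), a batch conversion could in principle be scripted. The following is only a minimal sketch using the Python standard library, not the route taken in this post: it drops all pictures and formatting, and the function names and folder arguments are made up for illustration.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside script/style blocks.
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of one HTML page, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def convert_folder(src_dir, dst_dir):
    """Write one .txt file per .html file found under src_dir.

    Files with the same name in different subfolders would overwrite
    each other; a real run would need to handle that.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(Path(src_dir).rglob("*.html")):
        html = path.read_text(encoding="utf-8", errors="replace")
        (dst / (path.stem + ".txt")).write_text(html_to_text(html), encoding="utf-8")
        count += 1
    return count
```

Calling `convert_folder("mirror", "text_out")` (both folder names hypothetical) would walk the downloaded copy of the blog and return the number of pages converted.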

Partly because I couldn't find any free/trial software that would let me batch convert files from HTML to Word and retain the pictures, I converted all the files from HTML to PDF using the batch conversion option of Adobe Acrobat Professional last night....

Unfortunately, this morning, I found out that... oops... I had problems importing the PDFs into NVivo; apparently, NVivo 8 doesn't handle PDFs all that well. More importantly, it would be crazy to import all 1500 PDF files into NVivo anyway... duh... lol

What I have accomplished so far is testing out various programs (trial versions) that allow batch conversion of files from PDF to Word. Although others might have their own preferences, mine is DeskUnPDF.... As I am typing this out, the file conversion is running in the background.

Cleaning up the files:

Some of the stuff in the files could have been cleaned up in batch at the HTML phase.... which I would recommend you do.

However, I didn't come to this realization until after all the files had been converted into PDFs. The stuff to clean up might include info such as the title of the blog and other things that get added automatically to your posts by Blogger and service providers alike. Programs such as Dreamweaver would allow you to edit most of it out in batch.
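One way to spot that kind of automatically added material is to look for lines that show up in every single post. Here is a minimal sketch of the idea, assuming each post is already available as a string of text; the function names and the `min_share` parameter are invented for illustration, and this is not what Dreamweaver does.

```python
from collections import Counter


def common_lines(documents, min_share=1.0):
    """Return the lines that occur in at least min_share of the documents."""
    counts = Counter()
    for doc in documents:
        # Count each distinct non-empty line once per document.
        counts.update({line.strip() for line in doc.splitlines() if line.strip()})
    threshold = min_share * len(documents)
    return {line for line, n in counts.items() if n >= threshold}


def strip_lines(document, unwanted):
    """Drop every line of document whose stripped text is in unwanted."""
    kept = [line for line in document.splitlines() if line.strip() not in unwanted]
    return "\n".join(kept)
```

Lines flagged this way are only candidates: a sentence that genuinely appears in every post would be flagged too, so the set is worth eyeballing before anything is deleted.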

In my scenario, I will have to find software that allows me to batch edit things out of the Word documents... including the notes added by DeskUnPDF to show that the file conversion was done using their trial version... 8-O
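If the converted documents were saved as plain text rather than Word files, this kind of batch edit could be scripted. A rough sketch under that assumption follows; the exact wording of the DeskUnPDF trial note is not known here, so the pattern below is a placeholder, and the function names are hypothetical.

```python
import re
from pathlib import Path

# Placeholder pattern: the real wording of the trial note is an assumption.
NOTICE = re.compile(r"trial version", re.IGNORECASE)


def remove_notice_lines(text, pattern=NOTICE):
    """Return text without the lines that match pattern."""
    lines = text.splitlines(keepends=True)
    return "".join(line for line in lines if not pattern.search(line))


def clean_folder(folder, pattern=NOTICE):
    """Rewrite each .txt file in folder that contains a matching line."""
    changed = 0
    for path in sorted(Path(folder).glob("*.txt")):
        original = path.read_text(encoding="utf-8")
        cleaned = remove_notice_lines(original, pattern)
        if cleaned != original:
            path.write_text(cleaned, encoding="utf-8")
            changed += 1
    return changed
```

Since this overwrites files in place, it would be safer to run it on a copy of the folder first.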

To be continued... since I am still in the conversion phase.
