Running MongoVUE on Windows 8 or Windows 8.1

If you are trying to install MongoVUE on Windows 8 or 8.1, you first need to install .NET Framework 3.5, as it doesn’t come preinstalled. The quick steps to ready your system are given below.

 

Step 1: Open Control Panel and select Programs

Control Panel

 

Step 2: Under Programs, select ‘Turn Windows Features On or Off’

Programs

 

Step 3: In the Windows Features dialog, select .NET Framework 3.5

Features

 

That’s it. Simply click the OK button. Windows will download .NET Framework 3.5 and install it too. Now your system is ready.


Build MongoDB indexes visually using MongoVUE

The concepts of building indexes in MongoDB are very similar to those of relational databases like MySQL, SQL Server etc. You can read about these concepts and theory on this webpage. In this tutorial, we’ll cover how MongoVUE GUI makes it easy to create MongoDB indexes.

We’ll refer to the following example used in the 10gen documentation:

db.factories.insert( { name: "xyz", metro: { city: "New York", state: "NY" } } );

// alternative A
db.factories.ensureIndex( { metro : 1 } );

// alternative B
db.factories.ensureIndex( { "metro.city" : 1, "metro.state" : 1 } );

 

Step 1:

Let’s fire up MongoVUE and navigate to our “infrastructure” database to list the “factories” collection.

image

 

Step 2:

Right-click on the “factories” collection, and select “Add Index…” from the context menu. This will launch the new index window.

image

 

Now there are 2 options to build indexes. You can either type the Json under the “Json” tab or you can build it visually as shown in the steps below.

Step 3:

Click on the “Visual” tab.

image

 

Step 4:

Now select the fields on which you want to build the index. In our example, we’ll select “metro”, and select an “Ascending” index on it.

image

That’s it. Hit the “Create” button and you are done.

You can also choose to expand “metro” key into its sub-keys, and use those to build the index (see below):

image

Again, simply hit the “Create” button and you are done!

Things to note:

  • MongoVUE does not scan every document in your collection to show an aggregated document under the “Visual” tab. It scans the first 5 documents only.
  • If these first 5 documents do not contain all the fields, you may index some fields under the “Visual” tab and then manually switch to the “Json” tab to add the remaining ones before hitting the “Create” button.
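
To picture what the “Visual” tab has to work with, here is a minimal sketch that merges the field names found in the first few documents of a collection. It only illustrates the idea described in the notes above and is not MongoVUE’s actual implementation; the function name aggregateFields and the sample documents are hypothetical.

```javascript
// Hypothetical illustration: collect dotted field paths from the first `limit` documents.
function aggregateFields(docs, limit) {
  const paths = new Set();

  function walk(value, prefix) {
    for (const key of Object.keys(value)) {
      const path = prefix ? prefix + "." + key : key;
      paths.add(path);
      const child = value[key];
      // Descend into sub-documents (plain objects), but not into arrays or null.
      if (child !== null && typeof child === "object" && !Array.isArray(child)) {
        walk(child, path);
      }
    }
  }

  docs.slice(0, limit).forEach(function (doc) { walk(doc, ""); });
  return Array.from(paths);
}

// Made-up sample data: "owner" appears only in the 6th document.
const sampleDocs = [
  { name: "xyz", metro: { city: "New York", state: "NY" } },
  { name: "abc", metro: { city: "Boston" } },
  { name: "def" },
  { name: "ghi" },
  { name: "jkl" },
  { name: "late", owner: "only in the 6th document" }
];

const visibleFields = aggregateFields(sampleDocs, 5);
console.log(visibleFields);
```

In this sketch the “owner” field never shows up, because it first appears in the sixth document. That is the kind of field you would add by hand under the “Json” tab.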

Establishing connections to servers and replica-sets using MongoVUE

The MongoVUE connection manager makes it easy to create and store connections to your MongoDB servers. Establishing a new connection is simple: click on the add icon and you’ll get a blank screen as shown below:

Step 1

image

 

Step 2:

Now simply add in the values as indicated below

Name – a name you want to give to this connection for easy mental recall

Server – IP address of your server or DNS name

Port – the port on which MongoDB server is listening

Username (optional) – username if you are using authentication

Password (optional) – password

Database (optional) – comma separated list of databases, if you want to connect to specific dbs. If you leave this empty then all databases will be available under “Database Explorer”.

image

image

That’s it. Hit the “Save” button, and double-click on your connection name to open it.

 

Step 2(b)

Let us now explore how to connect to replica sets. Let us assume that we have 4 servers in our set:

Server 1: a.replica-set.com  Port: 28001

Server 2: b.replica-set.com  Port: 28002

Server 3: c.replica-set.com  Port: 28003

Server 4: d.replica-set.com  Port: 28004

There is no separate place in the connection window to enter information on these 4 servers, and some replica sets may have even more servers. So, the trick is to enter the information on all these servers in the “Server:” text box. See image below:

image

The syntax to enter these servers is a comma separated list of server and its port configuration:

a.replica-set.com:28001, b.replica-set.com:28002, c.replica-set.com:28003, d.replica-set.com:28004

Please note that in the syntax above, the port number is optional. So if you are using default ports then you may simply skip the port info:

a.replica-set.com, b.replica-set.com, c.replica-set.com, d.replica-set.com
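
The comma separated syntax above can be pictured with a small parsing sketch. How MongoVUE parses the “Server:” text box internally is not documented in this post, so treat the helper parseServerList as a hypothetical illustration. It assumes plain host names or IPv4 addresses and falls back to MongoDB’s default port, 27017, when no port is given.

```javascript
// Hypothetical helper: split a "Server:" value into host/port pairs.
const DEFAULT_PORT = 27017; // MongoDB's default port

function parseServerList(text) {
  return text
    .split(",")
    .map(function (entry) { return entry.trim(); })
    .filter(function (entry) { return entry.length > 0; })
    .map(function (entry) {
      const parts = entry.split(":");
      // The port is optional; use the default when it is omitted.
      const port = parts.length > 1 ? parseInt(parts[1], 10) : DEFAULT_PORT;
      return { host: parts[0], port: port };
    });
}

const withPorts = parseServerList("a.replica-set.com:28001, b.replica-set.com:28002");
const withoutPorts = parseServerList("a.replica-set.com, b.replica-set.com");
console.log(withPorts, withoutPorts);
```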

You must have noticed in the screenshot above that the “Port:” textbox is replaced by a “ConnectTo:” dropdown. When you connect to a replica set, you can provide additional information on which specific server(s) you want to connect to. The options in this dropdown are:

All – Wait for all members of the replica set to be connected.

Primary – Wait for the primary member of the replica set to be connected.

AnySlaveOk – Wait for any slaveOk member of the replica set to be connected (includes primary, secondaries and passives).

You can read more about these options at this url.

That’s it. Hit the “Save” button, and double-click on your connection name to open it.


Connecting to a remote server over SSH

Many users have their MongoDB instance running on a remote server (say in AWS or some other cloud), and for security reasons all or most of the ports on this server are purposely blocked, including port number 27017 (which is the default one for MongoDB).

Now MongoVUE doesn’t natively support SSH protocol. So you cannot directly connect to these servers over SSH. Also, there are no SSH options in MongoVUE connections dialog!
mongovue-cannot-ssh-to-server

But there is some good news too. It is fairly easy to setup an SSH tunnel between your PC (client) and your server, and MongoVUE can use this tunnel to connect to your remotely running MongoDB instance.

mongovue-use-ssh-tunnel-on-putty

Let us do this stepwise.

Step 1

Download and install PuTTY. This software will be used to setup SSH tunnel.

 

Step 2

Launch PuTTY, and navigate to the “Connection > SSH > Tunnels” screen

PuTTY SSH Tunnel screen

 

Step 2(b)

For “Source port”, enter the port number you want to utilize on your client PC. We’ll use “5151”.

For “Destination”, enter the IP and port on the remote server to which you want to connect. Here we’ll use “127.0.0.1:27017”.

Select the “IPv4” radio button

Configure local port and destination info

 

Step 3

Click the “Add” button

Click the "Add" button

 

Step 4

Now click the “Session” category on the left, and enter IP of your remote server under “Host Name”.

Enter remote server IP

 

Step 5

Click the “Open” button. You will be asked to enter your user and password info on the shell. Once you log in, your SSH tunnel is set up!

 

Step 6

We are almost done. Let’s fire up MongoVUE and open a new connection window. In this window, the server and port number we enter are those of the client (PC) end of the SSH tunnel, so with the values used above that is 127.0.0.1 and port 5151. If MongoDB requires authentication, we can enter the credentials in the Username and Password fields.

 

Enter your port number plus auth info

 

Hit the “Save” button and then open the connection!


MongoVUE version 1.3.0 released

Features and upgrades include:

  • Inline editing
    • Users can edit values in TreeView directly by double clicking on a cell. (Note: The new value must be of the same type as the existing value. Direct editing of Arrays, Documents and Binary values is not yet supported)
  • Find View
    • A new ‘Find’ view has been introduced. The old view is deprecated and is now called ‘Find 2’
    • Both these views run in the background, so the GUI stays responsive
  • TreeView
    • For binary data, the subtype is now displayed in 3rd column (‘Type’)
    • Right click menu has 2 new items ‘Copy’ and ‘Copy Json’
    • BUG FIX: ‘Send To’ shortcut initializes the new view with entire document (formatted json code)
  • MapReduce View
    • Execution runs in the background, so gui stays responsive
    • New toolbar introduced on top
  • Data Import from RDBMS
    • Postgres db is now supported (paid version only)
    • New ‘Select All’ checkbox added as a shortcut to select/de-select all tables
    • BUG FIX: Multiple bugs fixed in importing data from MySQL (application freeze)
  • Underlying text editor used in many views upgraded to be faster, snappier
  • Database names can contain ‘-’
  • Input window that creates new db/collection now accepts ‘Esc’ and ‘Enter’ keys
  • BUG FIX – Using admin db username/password now correctly authenticates with all dbs
  • Code refactored (major)

How to perform MapReduce operations in MongoVUE

This post will detail the steps (and corresponding screens) in MongoVUE for solving the MapReduce problem defined in an earlier tutorial – Yet another MongoDB MapReduce tutorial (should be read before reading the present tutorial).

Step 1

Open MongoVUE and connect to the server that contains the collection “cities”

Connect to MongoDB 

 

Step 2

Right-click on “cities” collection under “Database Explorer”, and select “MapReduce”. This will launch the MapReduce view.

Launch the MapReduce view

 

Step 3

Write the JavaScript code for Map function in “Map” tab.

Write Map code

 

Step 4

Go to “Reduce” tab and enter your JavaScript Reduce code.

Write Reduce code

 

Step 5

Go to “Finalize” tab and enter your JavaScript Finalize code.

Write Finalize code

 

Step 6

Go to “In & Out” tab. Enter the Json code under “{Query}” to exclude cities from USA.

Enter Input options under {Query}

 

Step 7

We are almost done, but before we run this program, let’s just save it to disk first. Click on the small arrow next to “Go!” button and select “Save As”. You will be prompted for a filename. Enter a suitable name, and your MapReduce code will be saved to a corresponding “.vumr” file on your computer.

image

 

Step 8

We are ready to roll. Just click the “Go!” button, and that’ll start the MapReduce operation. At the end, you’ll see the results in the bottom pane. You can also check the time taken in the statusbar. Additionally, the shell command is also available under “Learn Shell” toolbox.

MapReduce is done

 

This completes our tutorial. You can check the Learn Shell toolbox, which displays the following command.

 

db.runCommand({ mapreduce: "cities",
 map : function Map() {
	var key = this.CountryID;
	emit(key, {
		"data":
		[
			{
				"name" : this.City,
				"lat"  : this.Latitude,
				"lon"  : this.Longitude
			}
		]
	});
},
 reduce : function Reduce(key, values) {

	var reduced = {"data":[]};
	for (var i in values) {
		var inter = values[i];
		for (var j in inter.data) {
			reduced.data.push(inter.data[j]);
		}
	}

	return reduced;
},
 finalize : function Finalize(key, reduced) {

	if (reduced.data.length == 1) {
		return { "message" : "This Country contains only 1 City" };
	}

	var min_dist = 999999999999;
	var city1 = { "name": "" };
	var city2 = { "name": "" };

	var c1;
	var c2;
	var d;
	for (var i in reduced.data) {
		for (var j in reduced.data) {
			if (i>=j) continue;
			c1 = reduced.data[i];
			c2 = reduced.data[j];
			d = Math.sqrt((c1.lat-c2.lat)*(c1.lat-c2.lat)+(c1.lon-c2.lon)*(c1.lon-c2.lon));
			if (d < min_dist && d > 0) {
				min_dist = d;
				city1 = c1;
				city2 = c2;
			}
		}
	}

	return {"city1": city1.name, "city2": city2.name, "dist": min_dist};
},
 query : { "CountryID" : { "$ne" : 254 } },
 out : { inline : 1 }
 });

Viewing hierarchical data in Table View

MongoVUE provides 3 different views of data – TreeView, TableView and TextView. The TableView provides a simplified representation of hierarchical data in 2 dimensions (akin to tables in relational databases). Some new improvements have been added in version 0.9.0 and these are discussed below.

Like always, let’s explore these enhancements with an example. We’ll take a collection named “countryInfo” which contains information on all countries. Each document in “countryInfo” looks like this:

{
  "_id": "Albania",
  "value": {
    "Capital": "Tirana ",
    "Currency": "Lek ",
    "Population": 3510484
  }
}

In other words, each document’s _id field is the unique Country name and the value key contains a sub-document with important information.

The TreeView (expanded for Albania) is shown below.

TreeView

 

If you go to TableView, you’ll notice a new bar/widget at the top displaying quick information on number of documents. 

New additions to MongoVUE TableView

You’ll also notice that if the cell data is a document then a small green colored arrow (pointing to right) is shown in the cell. This green arrow simply means that there is more information available in this cell and to get it, you simply have to double-click on that cell. So let’s double-click in the cell for Albania to dig in.

 

MongoVUE displaying hierarchical data

You can see that the bar at top now displays your trail (breadcrumbs): “100 Documents >> 1 >> value”. This tells you that out of 100 available documents, you chose the 2nd document (1st is index 0), then you are looking at its “value” key. You can click on any of these crumbs in the trail to go back.
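
The breadcrumb trail can be read as a path into the result set. The sketch below follows such a trail through an in-memory array; followTrail is a hypothetical helper written for this illustration (not part of MongoVUE), and the first document is a placeholder.

```javascript
// Hypothetical sketch: follow a breadcrumb trail into a list of documents.
function followTrail(documents, trail) {
  return trail.reduce(function (current, crumb) {
    return current[crumb];
  }, documents);
}

const documents = [
  { _id: "(first document)", value: {} }, // placeholder for index 0
  { _id: "Albania", value: { Capital: "Tirana ", Currency: "Lek ", Population: 3510484 } }
];

// Trail "1 >> value": index 1 is the 2nd document, then its "value" key.
const cell = followTrail(documents, [1, "value"]);
console.log(cell.Population);
```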


Exporting data from MongoDB

While working with MongoDB you’ll often run into situations where you need to get data out of your server. MongoVUE has some built-in exporting capabilities that make it fairly simple to fetch data in CSV and Microsoft Excel formats. Let’s explore this feature in detail.

Data can be exported from “View” and “Find” options through the “Refresh” dropdown and “More” dropdown respectively, as displayed below.

Exporting MongoDB data through View option

Exporting MongoDB data through Find option

 

Upon clicking the Export menu, you’ll get a popup window as shown below. The combo-box at the top allows you to select the format in which you want to export the data. Currently the following 3 formats are available:

  • CSV or Comma Separated Values
  • TSV or Tab Separated Values
  • MS Excel format

Export Documents from MongoDB

Each format displays a number of (fairly obvious) settings that allow you to fine-tune the exported data. Most of the time, these settings should work as is, without any modifications.

 

Now let’s get to the last step – MongoVUE gives you 3 buttons for exporting your data.

  • Clipboard: When you click on this button, the selected data is copied to clipboard. You can now go to a suitable application and “Paste”/Ctrl+V to get this data.
  • Instance: Launches Notepad (or the appropriate application) with the selected data in it.
  • File: Opens a “File Save As” dialog, allowing you to save the selected data to a file on your disk.

 

Please note that support for other export formats (like Json and Xml) is on the roadmap and will be added in the future.


Yet another MongoDB Map Reduce tutorial

Background

As the title says, this is yet-another-tutorial on Map Reduce using MongoDB. But two things are different here:

  1. A problem solving approach is used, so we’ll take a problem, solve it in SQL first and then discuss Map Reduce.
  2. Lots of diagrams, so you’ll hopefully better understand how Map Reduce works.

 

The Problem

So without further ado, let us get started. We’ll use the GeoBytes’ free GeoWorldMap database. It is a database of countries, their states/regions and major cities. You can find this database on this page under Geobytes’ Free Services section. The zip archive contains CSV files and instructions on importing this data to MySQL are available here.

The task is to find the 2 closest cities in each country, except in United States. (I excluded USA because over 75% of the cities in “cities” table are from USA, and by excluding it the results arrive much faster! Plus, it gives an additional flavor to the task.)

image
The image above displays the fields and corresponding datatypes of the “cities” table. Note the fields CountryID, Latitude and Longitude.

 

Assumptions

For the sake of simplicity, we’ll represent the earth as a 2D plane. The distance between any two points P1 (x1,y1) and P2 (x2,y2) on a 2D plane is computed as Square-Root of { (x1-x2)^2 + (y1-y2)^2 }
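
As a quick sanity check of this assumption, the same formula can be written as a small JavaScript function. The name planarDistance is just a label chosen for this sketch.

```javascript
// Planar (2D) distance between two points, as assumed in this tutorial.
function planarDistance(x1, y1, x2, y2) {
  const dx = x1 - x2;
  const dy = y1 - y2;
  return Math.sqrt(dx * dx + dy * dy);
}

// A 3-4-5 right triangle gives a distance of 5.
console.log(planarDistance(0, 0, 3, 4));
```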

 

SQL Solution

If the distance between each pair of cities in a country were known then we could simply apply a GROUP BY statement where we divide the data by Country and find those two cities where the distance is minimum. Since data is not available in this form, let’s try to manipulate it to get the desired structure.

/* QUERY1 - VIEW: city_dist */
create view city_dist as
select c1.CountryID,
	c1.CityId, c1.City,
	c2.CityId as CityId2, c2.City as City2,
	sqrt(pow(c1.Latitude-c2.Latitude,2) + pow(c1.Longitude-c2.Longitude,2)) as Dist
from cities c1 inner join cities c2
where c1.CountryID = c2.CountryID /* Country should be same */
and c1.CityId < c2.CityId  /* Calculate distance between 2 cities only once */
and c1.CountryID <> 254 /* Don't include US cities */; 

 

Now that we have the distance between each pair of cities, we can group this data by country and then select those 2 cities that have the least value for the “Dist” field that is still greater than zero. This can be accomplished easily as shown below:

 

/* QUERY 2 */
select city_dist.*
from (
	select CountryID, min(Dist) as MinDist
	from city_dist
	where Dist > 0 /* Avoid cities which share Latitude & Longitude */
	group by CountryID
) a inner join city_dist on a.CountryID = city_dist.CountryID and a.MinDist = city_dist.Dist;

 

That completes our SQL solution to the given problem. (You can delete the View “city_dist” later)

It is important to note the steps we followed. In the first step we performed all the computations (by calculating the distance between 2 cities of each country). In the next step we grouped (or divided) our results by country and selected those 2 cities where the value of distance was least. These steps can be represented graphically as shown below.

SQL 2 step solution

 

 

Map Reduce Solution

We can easily import our “cities” table from MySQL to MongoDB using MongoVUE. Instructions on importing are available here. Once this is done, a sample document in MongoDB looks like this:

Sample document in "cities" collection

 

Map Reduce is a 3 step approach to solving problems.

Map Reduce 2 step approach

 

Step 1 – Map

Map step is used to group or divide data into sets based on a desired value (called Key). This is actually similar to Step 2 of SQL solution above. The Map step is accomplished by writing a JavaScript function, and the signature of this function is given below.

function /*void*/ MapCode() {

}

 

In other words, the Map function takes no arguments and returns no data! That doesn’t seem very useful, does it? So let’s explore it in greater detail. Although the Map function doesn’t take any arguments, it gets invoked on each document of the collection as a method. Since it is invoked as a method, it has access to the “this” reference, so with “this” you can access any data within the “current” document. Something else that is available is the “emit” function, which takes 2 arguments: the first is the key on which you want to group the data, and the second is the data itself that you want to group.
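
To make the “invoked as a method” idea concrete, here is a rough sketch in plain JavaScript. The emit function, the loop that calls the Map function and the sample documents are all stand-ins invented for this illustration; they are not MongoDB internals.

```javascript
// Stand-in for MongoDB's emit: just record every key/value pair.
const emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

function MapCode() {
  // "this" is the document the function was invoked on
  emit(this.CountryID, this.City);
}

// Made-up sample documents
const docs = [
  { CountryID: 1, City: "A" },
  { CountryID: 2, City: "B" },
  { CountryID: 1, City: "C" }
];

docs.forEach(function (doc) {
  MapCode.call(doc); // invoke MapCode as a method of the current document
});

console.log(emitted);
```

Each document produces one emitted pair here, and pairs that share a key (CountryID 1 in this sample) are what the Reduce step later receives together.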

When we write the Map function, we need to be careful about 3 things.

  1. Firstly, how do we want to divide or group the data? In other words, what is our key? Or what should be passed as the first parameter to “emit” function?
  2. Secondly, what part of the data will we need and what part is extraneous? This helps us in determining the second parameter passed to the “emit” function.
  3. Thirdly, in what form or structure do we need our data? This helps us refine the second parameter of “emit” function.

Let’s find the answers to these questions.

  1. It should be quite evident that we will group our data based on “CountryID”. We used the same field in SQL too. So we’ll pass “CountryID” as the first parameter to “emit” function.

 

function MapCode() {
	emit(this.CountryID, ...);
}

 

We certainly don’t care about RegionID, TimeZone, DmaID, County and Code for calculating closest cities. We can easily ignore these. Keys that seem helpful are CityId, City, Latitude and Longitude.

 

function MapCode() {
	emit(this.CountryID,
	{
		"city": this.City,
		"lat":  this.Latitude,
		"lon":  this.Longitude
	});
}

 

With this we have answered our second question as well, i.e. what data is extraneous and what is necessary. Now before we get to the third question above, let’s understand a bit more about Reduce. After the Map step completes, we obtain a bunch of key-value pairs. In our case, the key is CountryID and the value is a Json object, as shown in the image below:

emit Function outputs Keys & (associated) Values

The Reduce operation aggregates the different values for each given key using a user-defined function. In other words, the Reduce operation will take up each key (or CountryID), pick up all the values (in our case Json objects) created by the Map step, and process them one by one using custom logic. Let’s look at the signature of the Reduce function.

 

function /*object*/ ReduceCode(key, arr_values) {

}

 

Reduce takes 2 parameters: 1) the key and 2) an array of values (the values output by the Map step for that key). The output of Reduce is an object. It is important to note that Reduce can be called multiple times on a single key! Yes, you read that correctly. It is not that difficult to see why: consider a case where your data is huge and lies on 2 different servers. It would be ideal to perform a Reduce for the given key on the first server, then perform a Reduce for the same key on the second server, and then do a Reduce on these two reduced values.

Here is a picture explaining Reduce step.

Reduce step

The picture above shows Reduce being called twice. This is just an example. To be frank, we don’t know how MongoDB executes Reduce. We don’t know which key is going to be reduced first and which key last. We also don’t know how many times it is going to call Reduce for a key. This optimization is better left to MongoDB itself, as it finds the most suitable parallel execution for every MapReduce command.

What we do know is that if Reduce is executed more than once then the value returned will be passed in a subsequent reduce as part of input.

For our given problem, we want Reduce to output all the cities of a given country (so that we can then try to find the closest two). So the expected format of final reduced value (rF) is:

 

{
	"data" : [
			{ city E },
			{ city B },
			{ . . .  }
	]
}     

 

But the input values in the Reduce array (param 2) should have exactly the same format as the output, as the output may be intermediate and may participate in a further Reduce. So let’s mould the Map function to produce values in the desired format above.

 

function MapCode() {
	emit(this.CountryID,
	{ "data":
		[
			{
				"city": this.City,
				"lat":  this.Latitude,
				"lon":  this.Longitude
			}
		]
	});
}

 

Our reduce function simply assimilates all the cities.

 

function ReduceCode(key, values) {

	var reduced = {"data":[]};
	for (var i in values) {
		var inter = values[i];
		for (var j in inter.data) {
			reduced.data.push(inter.data[j]);
		}
	}

	return reduced;
}
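
Because Reduce may be fed its own earlier output, it is worth checking that reducing in one pass and reducing in two passes give the same result. The sketch below repeats the reduce logic in plain JavaScript (with index-based loops) and compares both paths; the three sample values are made up for the check.

```javascript
// Same logic as the reduce function in this tutorial.
function ReduceCode(key, values) {
  var reduced = { "data": [] };
  for (var i = 0; i < values.length; i++) {
    var inter = values[i];
    for (var j = 0; j < inter.data.length; j++) {
      reduced.data.push(inter.data[j]);
    }
  }
  return reduced;
}

// Made-up values in the shape emitted by the Map step
const a = { data: [{ city: "A", lat: 0, lon: 0 }] };
const b = { data: [{ city: "B", lat: 1, lon: 1 }] };
const c = { data: [{ city: "C", lat: 2, lon: 2 }] };

const oneShot = ReduceCode(7, [a, b, c]);      // everything in one call
const partial = ReduceCode(7, [a, b]);         // an intermediate result...
const twoShots = ReduceCode(7, [partial, c]);  // ...fed back into Reduce

const sameResult = JSON.stringify(oneShot) === JSON.stringify(twoShots);
console.log(sameResult);
```

This works only because the emitted values and the reduced output share the same { "data": [...] } shape.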

 

This brings us to Finalize step. Finalize is used to do any required transformation on the final output of Reduce. The function signature of Finalize is given below:

 

function /*object*/ FinalizeCode(key, value) {

}

 

The function takes a key-value pair and outputs a value. After the Reduce is complete, MongoDB runs Finalize on each key’s final reduced value. The output of Finalize for all keys is put in a collection, and it is this collection which is the result of Map Reduce. You can give it a desired name, and if left unspecified, MongoDB selects a collection name for you.

image

In our case, we’ll use Finalize to find the closest 2 cities out of all the given cities in a country. Here is the Finalize function.

 

function FinalizeCode(key, reduced) {

	// A country with a single city has no pair to compare
	if (reduced.data.length == 1) {
		return { "message" : "This Country contains only 1 City" };
	}

	var min_dist = 999999999999;
	var city1 = { "city": "" };
	var city2 = { "city": "" };

	var c1;
	var c2;
	var d;
	// Numeric indexes, so each pair (i, j) with i < j is visited exactly once
	for (var i = 0; i < reduced.data.length; i++) {
		for (var j = i + 1; j < reduced.data.length; j++) {
			c1 = reduced.data[i];
			c2 = reduced.data[j];
			d = Math.sqrt((c1.lat-c2.lat)*(c1.lat-c2.lat)+(c1.lon-c2.lon)*(c1.lon-c2.lon));
			// Skip cities that share the same Latitude & Longitude (d == 0)
			if (d < min_dist && d > 0) {
				min_dist = d;
				city1 = c1;
				city2 = c2;
			}
		}
	}

	// The Map function above emits the city name under the "city" key
	return {"city1": city1.city, "city2": city2.city, "dist": min_dist};
}

 

This completes our MapReduce solution as well. We just need to filter out US cities when we invoke this, which is easy enough to do with a simple condition:

 

{
	CountryID: { $ne: 254 }  /* 254 is US CountryID */
}
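
To see all the pieces working together without a server, here is an in-memory sketch of the whole pipeline: filter, Map, group by key, Reduce, Finalize. The runMapReduce helper and the sample cities are hypothetical and exist only for this illustration. Real MongoDB decides by itself how often to call Reduce, and it does not call Reduce for a key that has only a single value, which the sketch imitates.

```javascript
function MapCode() {
  emit(this.CountryID, {
    data: [{ city: this.City, lat: this.Latitude, lon: this.Longitude }]
  });
}

function ReduceCode(key, values) {
  const reduced = { data: [] };
  values.forEach(function (v) {
    v.data.forEach(function (c) { reduced.data.push(c); });
  });
  return reduced;
}

function FinalizeCode(key, reduced) {
  if (reduced.data.length === 1) {
    return { message: "This Country contains only 1 City" };
  }
  let best = { city1: "", city2: "", dist: Infinity };
  for (let i = 0; i < reduced.data.length; i++) {
    for (let j = i + 1; j < reduced.data.length; j++) {
      const c1 = reduced.data[i];
      const c2 = reduced.data[j];
      const d = Math.sqrt(Math.pow(c1.lat - c2.lat, 2) + Math.pow(c1.lon - c2.lon, 2));
      if (d > 0 && d < best.dist) {
        best = { city1: c1.city, city2: c2.city, dist: d };
      }
    }
  }
  return best;
}

let emit; // stand-in for MongoDB's emit, assigned inside runMapReduce

// Hypothetical in-memory driver, not a MongoDB API.
function runMapReduce(docs, excludedCountryId) {
  const groups = new Map();
  emit = function (key, value) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  docs
    .filter(function (doc) { return doc.CountryID !== excludedCountryId; }) // plays the role of the $ne query
    .forEach(function (doc) { MapCode.call(doc); });

  const results = {};
  groups.forEach(function (values, key) {
    // Reduce is skipped when a key has a single value
    const reduced = values.length > 1 ? ReduceCode(key, values) : values[0];
    results[key] = FinalizeCode(key, reduced);
  });
  return results;
}

// Made-up sample data
const sampleCities = [
  { CountryID: 1, City: "P", Latitude: 0, Longitude: 0 },
  { CountryID: 1, City: "Q", Latitude: 3, Longitude: 4 },
  { CountryID: 1, City: "R", Latitude: 10, Longitude: 10 },
  { CountryID: 2, City: "S", Latitude: 5, Longitude: 5 },
  { CountryID: 254, City: "T", Latitude: 1, Longitude: 1 }
];

const closest = runMapReduce(sampleCities, 254);
console.log(closest);
```

In this sample, country 1 reports P and Q as its closest pair, country 2 only gets the single-city message, and country 254 is filtered out before the Map step.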

Points to note

  • This is clearly not intended to be a benchmark, but the SQL solution took about 100 sec on my laptop (the view creation took only 1 sec; the rest was spent in grouping and joins). Using a temp table or indexes would speed this up.
  • Map Reduce took 6 seconds to run.
  • There are other SQL and MapReduce solutions to this problem. For example, you could open cursors in SQL and iterate through all the records in nested for loops. Similarly, you could do 2 back-to-back MapReduce operations without resorting to the Finalize step. I’ll try to explore these in a future post.

If you want to learn about how to execute these steps in MongoVUE, then refer to this step-by-step tutorial.


Monitor your MongoDB servers

FourSquare is a popular website that uses MongoDB to store data. They make especially good use of MongoDB’s 2D indexes to provide geo-location features to their users. FourSquare recently had an outage and their site was unavailable for around 11 hours. Eliot Horowitz (10gen’s CTO) describes reasons for this outage and his thoughts on prevention in this post.

This instance highlighted the critical need to monitor the performance of production servers. There are a number of services available that provide monitoring of servers with the ability to store historical data and provide nice graphs and charts of key metrics including CPU, IO, Memory etc.

The new version of MongoVUE – 0.6.5, includes a simple and easy capability allowing you to monitor your MongoDB’s performance right from your desktop!

Step 1: Fire up MongoVUE. We are not going to use the “Database Explorer”, so let’s unpin it. There is a new button available in the toolbar, “Monitoring”. Click on it.

Click the "Monitoring" button

 

Step 2: Now click on the “Add Server” button. This brings up the Connection Manager – select the server that you want to Monitor.

Click the "Add Server" button

 

Step 3: The server is added to the screen and you can see it update its values in real time. The default refresh interval is 1 second. You can easily change this.

Selected server ("localhost") has its MongoDB values updated in real-time

 

That is it. You are done. Just keep this window open, and you can continuously monitor your MongoDB performance. To monitor more servers, go to step 2 and add more servers.

 

Important points to Note:

  • MongoVUE does not store any real-time data on disk
  • MongoVUE polls your server for only MongoDB stats, so you won’t get other (typical) monitors like Disk Space, CPU etc
  • MongoVUE uses db.serverStatus() command to get the data. This command requires admin privileges and may not work on some hosted MongoDB services like MongoHQ