PDA

View Full Version : kinda random question.



Orphious
12-07-01, 10:54 AM
I wonder if you guys could check my logic on thinging on this problem. I am suppost to do the following:

Say I have the following text file.

Pigs and Cows
Are Fat. Porsche is
a nice car. blahblah, you get the point.

1. Create a vector name of 26 binary trees for the letters of the alphabet.

Like such, vector<char> name(26); //isnt that how you declare a set size?

name[0] will take all the words beginning with A
name[1] will take B
name[2] will take C, etc, etc....

So if we read in Dogs it will see the D and insert it into the tree which name[3] points to. This making sense so far? So read in the word, get the first letter, check the appropriate tree, if its not found, insert it. Once your done, display all the words in alphabetic order.

Now, I get the searching and insertion ideas. But how do you make it specific for the particular cell in the vector? Like pass that it is suppost to do this searching in name[2] or name[3]? Any suggestions? Thanks fellas.

Stu
12-07-01, 07:52 PM
Does it have to be a vector? Because a map might be as easy to do, but sounds like it matches what you want to do much better than a vector does (at least in a logical sense).

A map creates a hash array where you have a key (in your case a single character 'a'-'z') and a value (in your case BST<string> or BST<char *>). The only thing is that all keys must be unique (if you need to have multiple instances of the same key you have to use a multimap--mmap). But, according to what you want to do, a map will be fine. Something like this would work:




#include <map>
#include <vector>
#include <string>
#include <cctype> // toupper() function

// Typedef your BST thing to MyBST here

int main(int argc, char **argv){

map<char, MyBST> my_hashmap; // Hash for the object
map<char, MyBST>::iterator hash_it // Iterator to loop through the map
vector<string> words; // Holder for all the parsed words
vector<string>::iterator it; // Iterator to loop through the words

// Parse the words into the vector here

// Loop through the words and insert them...
for(it = words.begin(); it != words.end(); it++)
my_hashmap[toupper(*it[0])].YourInsertMethod(*it);

// Loop through the map and do something...
for(hash_it = my_hashmap.begin(); hash_it != my_hashmap.end(); hash_it++){

// (*hash_it).first refers to the char in map<char, MyBST>
// (*hash_it).second refers to the instance of MyBST that relates to char in map<char, MyBST>
(*hash_it).second.YourMethodHere();

} // End for


} // End main()



Hope that isn't way more than you want. But, at the very least, you now have an idea of what a map is!

If you are dead set on a vector, then you can do something like this:




#include <vector>
#include <string>
#include <cctype> // toupper() function

// Typedef your BST thing to MyBST here

int main(int argc, char **argv){

vector<MyBST> my_bst(26); // Vector for the object
vector<MyBST>::iterator bst_it; // Iterator to loop through the vector
vector<string> words; // Holder for all the parsed words
vector<string>::iterator it; // Iterator to loop through the words

// Parse the words into the vector here

// Loop through the words and insert them...
for(it = words.begin(); it != words.end(); it++)
my_bst[toupper(*it[0]) - 'A'].YourInsertMethod(*it);

// Loop through the vector and do something...
for(bst_it = my_bst.begin(); bst_it != my_bst.end(); bst_it++){

*bst_it.YourMethodHere();

} // End for


} // End main()



Notice we are using the expression ' toupper(first character of word) - 'A' '. This is a quick way to convert A to 0, B to 1, ... Z to 25, by subtracting the ASCII value of 'A' from the first character in uppercase form. After all, a char is really just a short int--so, you can do integer math on it. Nifty, huh?

In either scenerio, it'll take the roughly the same amount of code to do what you want. However, the map fits a little better into the logic that you presented.

Two things I should mention about vectors, that you might be interested in. Little "F.Y.I." things that really only matter to those of us who optimize their programs.

First, you were correct in your assumption on how to set the initialized capacity. However, you should be aware that once you set a value in the vector the actual capacity could very well be around 85 cells or more.

Vectors and lists work kinda weird. They have predefined reserve "blocks", once you exceed a block, they add another block. Let's assume there is a reserve size of 256, the first item you insert creates storage for 256 items, when you insert the 256th item your storage size increases to 512 items, and so forth. The default size is compiler/library dependent (the 256 is the size for a int in the RogueWave library). Keep in mind, this is "un-indexed" reserved space on the heap--if you insert 4 items (in cells 0-3) and try to access cell 200, you will get a segmentation fault (coredump). You can set the reserve size manually (example: 'myvec.reserve(100);' will create 100 item blocks). More is better, as creating blocks causes overhead that slows execution speed.

Second, vectors and lists are very similar in use. However, vectors are faster when dealing with a small amount of simple classes as items or a small amount of types as items ("small" amounts would be less than say 15 million items). For large amounts, or complex classes, a list is much faster.

Orphious
12-07-01, 08:34 PM
Ya it does have to be a vector. I think I understand what you are saying, though I did get a little lost on some points. Let me ask real quick, the form to insert in the version I am using is like this.
BST<string> stringBST;
stringBST.insert(data);
That will insert the data into the binary tree. How would I incorporate that in the version you gave me?

O one more thing, How can I read in the .txt file and make sure it overlooks periods? Like tell it to not input the . at the end of a sentence. Thanks

Orphious
12-08-01, 12:31 PM
let me say how this works as of now.

you declare the class like so,
BST<string> stringBST;
that will construct an empty tree named stringBST.
then you insert via stringBST.insert(value); etc.....
What I am getting confused about is where the vector of size 26 fits in. I mean I understand the purpose, cell[0] is for A, etc..... but how could that then tie into the tree. Would the cell[0] point to the root node of the accompaning tree, does the tree reside in cell[0], I think I would understand it a little better if I knew how that part worked.

Stu
12-08-01, 02:09 PM
Let's take the second question first. You have a tree for each letter (there is an A tree, B tree, ... , Z tree). So, that's where the vector comes in. Also, when you insert the value into the tree, you might want to uppercase it--because 'A' != 'a' and they will most likely not get sorted right if the cases aren't consistant.

On to the first question, as for getting rid of punctuation and the like. I showed you how to do that in the answer to your Palindrome question (http://forums.speedguide.net/showthread.php?s=&threadid=54905) a couple months back. As to reading in the file, I'd read the entire thing into a string through a stringstream buffer. Use a function like this:




#include <string>
#include <fstream>
#include <sstream>

string read_file(string &);
string read_file(string & filename){

stringstream buf; // Buffer to transfer from a stream to a string
ifstream file(filename.c_str()); // File handle

if(!file){

cerr << "\nFile did not open!\n" << ends;
string error;
return error;

} // End if

// Dump the entire file into the buffer...
buf << file.rdbuf();

// Return the string...
return buf.str();

} // End read_file()

Orphious
12-08-01, 02:32 PM
for(it = words.begin(); it != words.end(); it++)
my_bst[toupper(*it[0]) - 'A'].YourInsertMethod(*it);


How does my insert fit into what you gave. my_bst[toupper(*it[0]) - 'A'].stringBST.insert(*it); doesnt seem to work. Is there a way that I am not thinking of? thanks

Stu
12-08-01, 03:37 PM
my_bst[toupper(*it[0]) - 'A'] is an individual stringBST. So, what you are trying is basically the same as stringBST.stringBST.insert(*it)...which isn't right.

Orphious
12-08-01, 05:00 PM
I see now, that was the whole point with the typedef BST mybst and vector<mybst> my_bst, etc..... I got it.

Orphious
12-08-01, 05:46 PM
does toupper() only work with a character? Like could I change a whole string via toupper(stringinput) or does that not work. I tried it and spit some errors at me, so I'm guessing that it doesn't. Would the way to do it be looping throw the string and change each one? Or is there a better way?

--------------------------------
Nevermind, I got it.