I know there is no relation between Scraper and Pyramid, but when you click the new Pyramid icon you'll see the Scraper module.
Scraper is a programmable tool that can grab information from the internet, organize it and produce results. This release contains two sample functions. One of them finds Myspace profiles from e-mail addresses and the other finds Stickam profiles. All you need to do is paste your e-mail address list into the Value List box, choose a Function and click Start. Scraper will search Myspace for each e-mail address one by one and display the results.
You can double-click a result to display the discovered profile. If you want, you can create a list of profile URLs or an "E-mail | Profile URL" list by clicking the "Generate Lists" toolbar icon.
If you want to search Stickam profiles, change the Function to "Stickam Profile Finder". You can also choose "ALL" instead of a single function from the Functions combo box. In this case Scraper runs both the Myspace and Stickam profile finders together. Scraper appends each result as soon as it arrives, so the order of the results may not match the order of your list. If you click the "Sort by value list order" button, you'll see the Myspace result and the Stickam result for each e-mail address next to each other. This makes things easier if you want to generate a list of results.
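To illustrate what sorting by value list order and generating an "E-mail | Profile URL" list amount to, here is a minimal Python sketch. It is not Scraper's own code. The function names, the tuple layout `(value, function name, result)` and the example.com addresses are all assumptions made for the illustration.

```python
def sort_by_value_order(values, results):
    """Reorder (value, function_name, result) tuples to follow the value list.

    Hypothetical helper: results arrive in whatever order the lookups finish,
    so we sort them by the position of their value in the original list.
    """
    position = {value: index for index, value in enumerate(values)}
    # sorted() is stable, so results for the same value keep their arrival order.
    return sorted(results, key=lambda item: position[item[0]])


def generate_list(results):
    """Build "E-mail | Profile URL" lines from the result tuples."""
    return [f"{value} | {result}" for value, _function_name, result in results]


# Made-up data: two e-mail addresses, results arriving out of order.
values = ["a@example.com", "b@example.com"]
results = [
    ("b@example.com", "Myspace Profile Finder", "http://example.com/myspace/b"),
    ("a@example.com", "Stickam Profile Finder", "http://example.com/stickam/a"),
    ("a@example.com", "Myspace Profile Finder", "http://example.com/myspace/a"),
]

for line in generate_list(sort_by_value_order(values, results)):
    print(line)
```

After sorting, both results for the first address come before the result for the second one, which is what makes the generated list easy to read.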
Advanced Scraper Functions
You can use Scraper to discover profiles from e-mail addresses, but this is not the only thing Scraper can do. It is possible to change how Scraper works, but doing so requires some knowledge of HTML and Regular Expressions. The rest of this article is for people who know these.
Did you notice Search Patterns? Scraper works in a similar way but is more flexible than Search Patterns. Basically, Scraper downloads HTML, extracts information with Regular Expressions and displays the results. You can create your own functions to change Scraper's behavior.
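The download, extract and display cycle can be sketched in a few lines of Python. This is only a rough illustration of the idea, not how Scraper is implemented. The names `fetch_html` and `scrape` are made up for the example, and the page is assumed to be UTF-8 text.

```python
import re
from urllib.request import urlopen


def fetch_html(url):
    """Download a page and decode it as text (assumes UTF-8, replaces bad bytes)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape(url, pattern, fetch=fetch_html):
    """Return the captured groups of every regex match in the HTML behind url.

    The fetch argument is a hypothetical hook so the download step can be
    swapped out, for example with canned HTML while experimenting.
    """
    html = fetch(url)
    return [match.groups() for match in re.finditer(pattern, html)]
```

For example, calling `scrape(url, r'<a href="([^"]+)" title="([^"]+)"')` would return one `(href, title)` tuple for every anchor in the page that is written in exactly that attribute order.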
This is the sample setup for the Myspace Profile Finder function. Creating a function is very easy once you get the basics. Scraper runs a function for every supplied value, one by one. Users supply values through the Value List box. I've created a table for all the settings.
| Setting | Description |
| --- | --- |
| Name | The name of the function |
| Category | The category of the function |
| Accept Pattern | If you enter a regex pattern here, the function runs only if the supplied value matches this pattern. Since our example needs e-mail addresses as values, I've entered @ as the accept pattern. Only values that contain the @ character will start this function. You can leave it empty if you do not need value validation. |
| Query | The format used to generate URLs. When a function starts, Scraper creates a URL from this format. The query format contains a {q} replacement tag, which Scraper replaces with the supplied value to build the URL. Our example query is the Myspace search page URL: Scraper replaces {q} with the supplied e-mail address and then downloads the HTML source from the resulting address. |
| Main Pattern | This regex pattern is used to match values in the HTML source. Our example pattern looks for the "msProfileLink" <span> tag in order to fetch the profile URL. If this pattern matches, it means a profile was found for the supplied e-mail address. Our regex captures two values: the title attribute of the anchor for the profile title and the href attribute of the anchor for the profile URL. We'll use these values to generate results. |
| Result Format | Scraper generates two results. One is the result itself and the other is the result displayed to the user. If the generated result is a URL, Scraper navigates to that URL when the result is double-clicked. In our example the result of the function is the profile URL. The profile URL is the second value captured by the Main Regex Pattern, so we can grab it with the $2 tag. $1 refers to the profile title. |
| Display Format | Here we create a format for displaying the result. The profile URL alone would be enough, but we can display something more informative. Again, {q} is replaced with the supplied value and the $1 and $2 tags are replaced with the Main Regex results. In our example {q} is the supplied e-mail address, $1 is the profile title and $2 is the profile URL. |
| Fail Message | If the Main Regex Pattern does not match anything, the function fails. In this case we can display a message, or just skip the item by unchecking the Display Failed Items check box. |
| Final action when a result is found | This is the most amazing part of Scraper. Instead of displaying the result, you can call another function with the generated value. For example, instead of displaying an Album URL you can pass the Album URL to another function that does something else with it. We call these second-level functions Sub Functions. Scraper does not show them to the user, but other functions can call them. Scraper contains an example Sub Function. You can enable it by choosing the "Call a Function" radio button and entering the sub function name there. Do not enter this function's own name here or you will create an endless loop. The example sub function downloads the supplied Myspace profile URL and extracts the photo album URL if one exists. When you enable this sub function, Scraper displays the Album URLs of the found profiles instead of the Profile URLs. Two functions work together here: one finds the profile URL and the other extracts the Album URL. Not all Myspace profiles have album links, so in this case Scraper displays only the profiles that have Album URLs. There is another sample function called "Myspace Female Profile Finder". This function is disabled by default, so users will not see it in the Functions combo box of the console UI. You can enable it and see how the Functions and Category combo boxes work. This function displays the Myspace profile if the supplied e-mail address is attached to a female profile. The only difference from the first Myspace Profile Finder is the &g=F parameter, which filters female profiles from the search results. I've added this function to demonstrate different uses of functions. You can attach the "Myspace Album Finder" sub function to this function too. In this case Scraper displays only the album links of female profiles. |
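To make the settings in the table more concrete, here is a rough Python sketch of how one function could be run for one value, including the hand-off to a sub function. It is an illustration under assumptions, not Scraper's real implementation. The `Function` class, the `fill` and `run` helpers, the default fail message and every URL and HTML snippet are hypothetical, and the sketch does no URL encoding of the value.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Function:
    """Hypothetical container mirroring the settings table."""
    name: str
    query: str                    # URL format containing the {q} tag
    main_pattern: str             # regex with capture groups
    result_format: str            # e.g. "$2"
    display_format: str           # e.g. "{q} | $1 | $2"
    accept_pattern: str = ""      # empty means every value is accepted
    fail_message: str = "No result for {q}"   # assumed default text
    sub_function: Optional[str] = None        # name of the function to call next


def fill(template, value, groups):
    """Replace {q} with the value and $1, $2, ... with the captured groups."""
    def substitute(match):
        if match.group(0) == "{q}":
            return value
        index = int(match.group(1))
        if 1 <= index <= len(groups):
            return groups[index - 1] or ""
        return match.group(0)  # unknown tag, leave it untouched
    return re.sub(r"\{q\}|\$(\d+)", substitute, template)


def run(function, value, functions, fetch):
    """Return None if the value is rejected, else a (result, display text) pair."""
    if function.sub_function == function.name:
        raise ValueError("a function must not call itself as its sub function")
    if function.accept_pattern and not re.search(function.accept_pattern, value):
        return None  # the value does not match the accept pattern
    html = fetch(fill(function.query, value, ()))
    match = re.search(function.main_pattern, html)
    if match is None:
        return None, fill(function.fail_message, value, ())
    result = fill(function.result_format, value, match.groups())
    if function.sub_function:
        # Final action: pass the generated result on to another function.
        return run(functions[function.sub_function], result, functions, fetch)
    return result, fill(function.display_format, value, match.groups())


# Canned pages stand in for real downloads; all addresses are invented.
pages = {
    "http://search.example/?q=a@example.com":
        '<span class="msProfileLink"><a title="Alice" href="http://profiles.example/alice">',
    "http://profiles.example/alice": '<a href="http://albums.example/alice">Photos</a>',
}
finder = Function(
    name="Profile Finder",
    query="http://search.example/?q={q}",
    main_pattern=r'msProfileLink.*?title="([^"]*)"\s+href="([^"]*)"',
    result_format="$2",
    display_format="{q} | $1 | $2",
    accept_pattern="@",
)
print(run(finder, "a@example.com", {}, lambda url: pages.get(url, "")))
```

Setting `sub_function` on `finder` to the name of a second function whose query is just `{q}` would make `run` download the profile URL and return whatever that second function extracts, which is the chaining described in the last table row.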
If you add custom functions, you won't lose them during upgrades. NG handles the data files of this tool differently. You can also import and export custom functions using the command buttons located under the Functions list box. That's all for now. If you are interested in this tool, please share your ideas in our forum and help me add more possibilities for creating different types of functions.