Time for action – adding the browser controls

Return to the first card of the stack and find your way to the native controls part of the MobGUI window. The following steps will guide you through it:

  1. Drag the Browser control onto the card window.
  2. Resize the control to fill the width of the card, and set its height so that it fits between the tab bar and a point a little way below the NavBar. Give it the name Page.
  3. With the browser control selected, make sure that the box in the MobGUI window titled Auto delete is checked. This will help reduce the memory usage of the final app during the times you're not on the browser card.
  4. From the MobGUI window, drag an Input control into the gap between the browser control and the NavBar. Name it url and resize it to be nearly as wide as the card, leaving space for the Go button on the right.
  5. Drag a Button control into that space, set its label to Go, and resize it to look nice.
  6. Edit the script of the Go button (which, as you may notice, is really a group) and add a couple of lines to the mouseUp handler, as follows:
    on mouseUp
      mobileControlSet "Page", "url", the mgText of group "url"
      focus on nothing
    end mouseUp
  7. Later, we will send an init message to the cards. For the Browser card, we can use this as a way to restore the previously chosen web page. Add the following to the Browser card script:
    on init
      global gPageURL
      if gPageURL is not empty then
        set the pURL of group "Page" to gPageURL
      else
        set the pURL of group "Page" to "http://www.google.com/"
      end if
    end init
  8. Edit the script of the browser (group Page) control. We're going to use the browserFinishedLoading message to know when to update some variables and URL text.
  9. Modify the browserFinishedLoading handler in the browser control's script so that it reads as follows:
    on browserFinishedLoading pURL,pType
      global gPageURL,gPageHTML
      put pURL into gPageURL
      put url pURL into gPageHTML
      set the mgText of group "url" to pURL
      pass browserFinishedLoading
    end browserFinishedLoading
  10. Save and perform another Test to see the browser card in action.

What just happened?

Setting the URL of the browser control to the mgText of the url input was enough to make the browser function, but some of what we just did was preparation for what the other cards will need. In particular, we used LiveCode's regular put url form to stash a copy of the web page's HTML in a global variable; we will need this when we start extracting links and media from the page.

The Links card

The Links, Text, and Media cards will take the page source that is stored in the gPageHTML global variable and extract the bits of interest from it. How will they do that?

A common approach to extracting a known pattern of text is to use regular expressions, often referred to as regex or regexp. At its simplest, regex is easy to understand, but it can get quite complex.

Note

If you want to understand regular expressions in depth, read the Wikipedia article at:

http://en.wikipedia.org/wiki/Regular_expression

Another useful source of information is this Packt Publishing article on regular expressions, which you can find at http://www.packtpub.com/article/regular-expressions-python-26-text-processing.

One problem, though, is that using regexp to parse HTML content is frowned upon. There are scores of articles online telling you outright NOT TO parse HTML with regexp! Here's one pithy example:

http://boingboing.net/2011/11/24/why-you-shouldnt-parse-html.html.

Now, parsing an HTML source is exactly what we want to do here, and one solution to the problem is to mix and match, using LiveCode's other text matching and filtering abilities to do most of the work. Some of LiveCode's matching and filtering features accept simple wildcard patterns; these are not full-blown regexp, but they are somewhat easier to understand. So, let's begin by using these.
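As a rough sketch of what this kind of filtering looks like, the following function uses LiveCode's filter command with a wildcard pattern to throw away every line that does not mention a link. The function name linesWithLinks and its parameter name are made up for this illustration, and the pattern assumes that links are written as a href in lowercase with a single space:

```livecode
function linesWithLinks pPageSource
  -- hypothetical helper, not part of MobGUI or LiveCode itself
  -- start a new line before every tag, so that each tag can be tested on its own
  replace "<" with (return & "<") in pPageSource
  -- keep only the lines that contain the text "a href"
  filter pPageSource with "*a href*"
  return pPageSource
end linesWithLinks
```

In a handler that has declared global gPageHTML, you could call it as put linesWithLinks(gPageHTML) into tLinkLines. The asterisks in the pattern stand for any run of characters, which is why no regexp is needed for this step.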

While looking for links, we will make the assumption that the link is inside an a href tag, but even then, there are a lot of variations of how that can appear. The general structure of an href tag is like this:

<a href="http://www.runrev.com/support/forum/">Link text that the user will see</a>

The phrase Link text that the user will see appears in the text of the web page. When the mouse points at it, the user sees the pointing finger cursor, and when it is clicked, the browser loads the URL given in the href part of the tag.

The preceding example shows the full path to the support forum; here are some of the ways that the very same web location might be written in a page link:

http://www.runrev.com/support/forum/

/support/forum/

support/forum/

../support/forum/

The first link will take you there no matter where you are at that time. The second will take you there if you're somewhere else on the http://runrev.com/ site. The third will be correct while you are at the root level of http://runrev.com/, and the last example would work from within one of the other root-level directories on the site.
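To make these four cases concrete, here is a minimal sketch of a function that turns any of them into a full path, given the URL of the page the link was found on. The name fullURL and both parameter names are assumptions made for this example, and the function deliberately ignores details such as query strings, anchors, and ./ prefixes:

```livecode
function fullURL pLink, pPageURL
  -- hypothetical helper: pLink is the href value, pPageURL is the page it came from
  set the itemDelimiter to "/"
  -- case 1: the link is already a full path
  if pLink begins with "http://" or pLink begins with "https://" then return pLink
  -- case 2: a leading slash means "start from the root of this site"
  -- items 1 to 3 of a URL are "http:", an empty item, and the host name
  if char 1 of pLink is "/" then return (item 1 to 3 of pPageURL) & pLink
  -- cases 3 and 4: the link is relative to the folder that holds the current page
  put pPageURL into tFolder
  if the last char of tFolder is "/" then
    delete the last char of tFolder
  else if the number of items in tFolder > 3 then
    delete the last item of tFolder -- drop the file name
  end if
  -- each leading "../" moves up one folder, but never above the site root
  repeat while pLink begins with "../"
    delete char 1 to 3 of pLink
    if the number of items in tFolder > 3 then delete the last item of tFolder
  end repeat
  return tFolder & "/" & pLink
end fullURL
```

For example, fullURL("../support/forum/", "http://www.runrev.com/products/index.html") is intended to give http://www.runrev.com/support/forum/. Treating the URL as a list of items separated by slashes keeps the logic to plain chunk expressions rather than regexp.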

With regex, you might create an extravagant expression that deals with all possible variations of how the links are contained in the page source, but even then it would not give us the full paths we need.

By taking things slowly, we can reduce the whole page source to a set of lines of "a href" entries, extract the URL part of each line, and finally, take the preceding variations and convert them into full path URLs.
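The middle step, pulling the URL out of a single "a href" line, might look something like the following sketch. The function name hrefOf is invented for this example, and it assumes that the href value is wrapped in double quotes with no spaces around the equals sign:

```livecode
function hrefOf pLine
  -- hypothetical helper: returns the text between href=" and the next quote
  put offset("href=" & quote, pLine) into tStart
  if tStart is 0 then return empty
  -- remove everything up to and including the opening quote
  delete char 1 to (tStart + 5) of pLine
  put offset(quote, pLine) into tEnd
  if tEnd is 0 then return empty
  return char 1 to (tEnd - 1) of pLine
end hrefOf
```

Links that use single quotes, or no quotes at all, would come back as empty here, so a real page would need a little extra tidying before this step.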
