Return to the first card of the stack and find your way to the native controls part of the MobGUI window. The following steps will guide you through it:
Add a browser control and name it Page.
Add an input control, name it url, and resize it to be nearly as wide as the card, leaving space for the Go button on the right.
Add a button, name it Go, and resize it to look nice.
on mouseUp
   mobileControlSet "Page", "url", the mgText of group "url"
   focus on nothing
end mouseUp
An init message is sent to the cards. For the Browser card, we can use this as a way to restore the previously chosen web page. Add the following to the Browser card script:
on init
   global gPageURL
   if gPageURL is not empty then
      set the pURL of group "Page" to gPageURL
   else
      set the pURL of group "Page" to "http://www.google.com/"
   end if
end init
Edit the script of the browser (Page) control. We're going to use the browserFinishedLoading message to know when to update some variables and URL text.
on browserFinishedLoading pURL,pType
   global gPageURL,gPageHTML
   put pURL into gPageURL
   put url pURL into gPageHTML
   set the mgText of group "url" to pURL
   pass browserFinishedLoading
end browserFinishedLoading
Setting the pURL property of the browser control to mgText was enough to make the browser function, but some of what was just done was in preparation for what we'll need in the other cards. In particular, we used the regular LiveCode put url command to stash a copy of the web page's HTML code in a global variable, which will be needed when we start extracting links and media from the page.
The Links, Text, and Media cards will take the page source that is stored in the gPageHTML global variable and extract the bits of interest from it. How will they do that?
A common approach to extracting a known pattern of text is to use regular expressions, which are often referred to as regex or regexp. At its simplest, a regular expression is easy to understand, but it can get quite complex.
If you want to understand regular expressions in depth, read the Wikipedia article at http://en.wikipedia.org/wiki/Regular_expression
Another useful source of information is this Packt Publishing article on regular expressions, which you can find at http://www.packtpub.com/article/regular-expressions-python-26-text-processing.
One problem, though, is that using regexp to parse HTML content is frowned upon. There are scores of articles online telling you outright NOT to parse HTML with regexp! Here's one pithy example at http://boingboing.net/2011/11/24/why-you-shouldnt-parse-html.html.
Now, parsing an HTML source is exactly what we want to do here, and one solution to the problem is to mix and match, using LiveCode's other text matching and filtering abilities to do most of the work. Some of LiveCode's matching and filtering functions accept patterns that are not exactly regexp and are somewhat easier to understand than full-blown regexp. So, let's begin by using these.
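As a rough illustration of that mix, the following sketch uses the filter command with a wildcard pattern to keep only the lines of interest, and the matchText function with a short regular expression to pull one value out of what remains. The handler name demoMatching and the sample text are made up for this example and are not part of the Browser stack.

```livecode
-- Hypothetical demo handler; the sample lines are invented for illustration.
on demoMatching
   local tSource, tPattern, tURL
   put "<p>Some text</p>" & return & \
         "<a href=" & quote & "/support/forum/" & quote & ">Forum</a>" & return & \
         "<img src=" & quote & "logo.png" & quote & ">" into tSource

   -- filter takes a wildcard pattern: * stands for any run of characters
   filter tSource with "*a href*"
   -- tSource now holds only the line that contains the link

   -- matchText takes a regular expression; the part in parentheses is
   -- copied into tURL when the match succeeds
   put "href=" & quote & "([^" & quote & "]*)" & quote into tPattern
   if matchText(tSource, tPattern, tURL) then
      answer "Found link:" && tURL
   end if
end demoMatching
```

The pattern is built with the quote constant because a LiveCode string literal cannot contain a double quote character directly.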
While looking for links, we will make the assumption that the link is inside an a href tag, but even then, there are a lot of variations of how that can appear. The general structure of an href tag is like this:
<a href="http://www.runrev.com/support/forum/">Link text that the user will see</a>
The text of the web page will show the phrase "Link text that the user will see". When the mouse points at it, the user will see the pointing finger cursor, and when it's clicked on, the page will reload using the URL shown in the href part of the tag.
The preceding example shows the full path to the support forum; here are the ways that the very same web location might be written in a page link:
http://www.runrev.com/support/forum/
/support/forum/
support/forum/
../support/forum/
The first link will take you there no matter where you are at that time. The second will take you there if you're somewhere else on the http://runrev.com/ site. The third will be correct while you are at the root level of http://runrev.com/, and the last example would work from within one of the other root-level directories on the site.
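The four cases above can be sketched as a small function. This is only one possible approach, under stated assumptions: the name resolveLink and its parameters are hypothetical, the page URL is assumed to start with http:// or https://, and other link forms (such as anchors or mailto links) are not handled.

```livecode
-- Hypothetical helper, not from the stack: converts a link as written in
-- the page (pLink) into a full URL, given the URL of the page it came
-- from (pPageURL). Assumes pPageURL starts with http:// or https://.
function resolveLink pLink, pPageURL
   local tSite, tFolder
   set the itemDelimiter to "/"
   -- items 1 to 3 are "http:", "" and the host, e.g. http://www.runrev.com
   put item 1 to 3 of pPageURL into tSite

   -- tFolder is the page URL up to and including its last slash
   if (the number of items of pPageURL) <= 3 then
      put tSite & "/" into tFolder
   else
      put pPageURL into tFolder
      repeat until the last char of tFolder is "/"
         delete the last char of tFolder
      end repeat
   end if

   if (pLink begins with "http://") or (pLink begins with "https://") then
      -- already a full path
      return pLink
   else if char 1 of pLink is "/" then
      -- relative to the site root
      return tSite & pLink
   end if

   -- each leading ../ climbs one folder, but never above the site root
   repeat while pLink begins with "../"
      delete char 1 to 3 of pLink
      if (the number of items of tFolder) > 3 then
         delete the last char of tFolder
         repeat until the last char of tFolder is "/"
            delete the last char of tFolder
         end repeat
      end if
   end repeat
   return tFolder & pLink
end resolveLink
```

For example, with a made-up page address, resolveLink("../support/forum/", "http://www.runrev.com/products/index.html") would return http://www.runrev.com/support/forum/.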
With regex, you might create an extravagant expression that deals with all possible variations of how the links are contained in the page source, but even then it would not give us the full paths we need.
By taking things slowly, we can reduce the whole page source to a set of lines of "a href" entries, extract the URL part of each line, and finally, take the preceding variations and convert them into full path URLs.
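The first two of those steps might look something like the following sketch. The function name extractLinks is hypothetical, and the code assumes that each link is written as a href="..." with double quotes and that the tag is not broken across lines. It returns the links exactly as written in the page, so converting them into full path URLs is left as a separate step.

```livecode
-- Hypothetical helper: returns one raw href value per line from pHTML.
-- Assumes each link is written as <a href="..."> with double quotes.
function extractLinks pHTML
   local tLinks, tStart, tRest, tLine
   -- start a new line at every tag so each a href entry is on its own line
   replace "<" with return & "<" in pHTML
   -- keep only the lines that contain an a href entry
   filter pHTML with "*a href*"

   set the itemDelimiter to quote
   repeat for each line tLine in pHTML
      put offset("href=" & quote, tLine) into tStart
      if tStart > 0 then
         -- skip the 6 characters of href=" and keep up to the closing quote
         put char (tStart + 6) to -1 of tLine into tRest
         put item 1 of tRest & return after tLinks
      end if
   end repeat
   -- drop the trailing return
   if tLinks is not empty then delete the last char of tLinks
   return tLinks
end extractLinks
```

On the Links card, this could be called with the stored page source, for example extractLinks(gPageHTML).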