Time for action – setting up the Text card

We will start off in the test stack that you made, so that we can get the function working there before adding it to the WebScraper stack.

  1. Duplicate the button you made when extracting links. Change the function call getLinks to getText; the rest of the script can remain the same.
  2. Edit the script of the test stack and add this function:
    function getText pPageSource
      put replaceText(pPageSource,"(?:<(?P<tag>script|style)[\s\S]*?</(?P=tag)>)|(?:<!--[\s\S]*?-->)|(?:<[\s\S]*?>)","") into pPageSource
      replace lf with "" in pPageSource
      replace tab with " " in pPageSource
      return pPageSource
    end getText
  3. Try clicking on the button you just made. You should see your second field filled with just the text parts of the web page.
  4. Copy the function and go back to the WebScraper stack script. Paste the function there.
  5. Go to the Text card of the stack and from the MobGUI window, drag the Multiline Text control onto the card. Set its name to PageText.
  6. Resize the control to fill the area between the NavBar and the Tab-bar. You may have to use the LiveCode Inspector to modify the size if the text does not fill the field.
  7. In the MobGUI window properties for the control, uncheck the box for Editable.
  8. Edit the card script and add this init function:
    global gPageHTML
    
    on init
      if the platform is "iphone" or the platform is "android" then
        mobileControlSet "PageText","text",getText(gPageHTML)
      end if
    end init
  9. Try a Test of the app.
  10. In the Browser card, change the URL from http://google.com/ to http://runrev.com/ and click on Go.
  11. Press the Text tab button at the bottom.
  12. You should now be on the Text card and should be able to see the text elements from the web page displayed in a native scrolling text field.
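The init handler from step 8 only does anything on a device or in a simulator, so nothing appears when the card is opened on the desktop. As a minimal sketch, the variant below adds a desktop branch so the extracted text can be checked in the IDE. It assumes an ordinary LiveCode field named PagePreview has been added to the card. That field is hypothetical and not part of the original stack.

```livecode
global gPageHTML

on init
  if the platform is "iphone" or the platform is "android" then
    -- native text control created with MobGUI, as in step 8
    mobileControlSet "PageText","text",getText(gPageHTML)
  else
    -- assumed desktop fallback: "PagePreview" is a hypothetical plain field
    put getText(gPageHTML) into field "PagePreview"
  end if
end init
```

If you try this, remember that the PagePreview field would need to be hidden or removed before building the mobile app.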

What just happened?

This enormously long regular expression ran through the web page source and removed anything that was script, style, comment, or just tag information, leaving the text parts alone. On its own, that would still leave lots of spare line feed and tab characters, so we went on to clean those up with the LiveCode replace command, deleting each line feed and turning each tab into a space. The final text may not be perfect, but you can use the standard mobile text features to copy parts of the text for use in other apps.
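To see the three alternatives of the pattern at work on something smaller than a real page, here is a sketch of a button script for the test stack. The sample HTML is made up for illustration, and the script assumes that getText is already in the stack script and that the character classes in its pattern are written as [\s\S], that is, any character including line breaks.

```livecode
on mouseUp
  local tSample
  -- hand-written sample with a style block, a comment, and ordinary tags;
  -- it contains no double quotes, so it fits in a single string literal
  put "<html><head><style>p {color:red}</style></head><body><!-- note --><p>Hello <b>world</b></p></body></html>" into tSample
  -- the style block and the comment are removed together with their contents;
  -- every other tag is removed, but the text between the tags is kept
  answer getText(tSample)
end mouseUp
```

With this sample, the dialog should show only Hello world. The style rule inside the style block and the note inside the comment are both dropped, while the text inside the p and b tags survives.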

The Media card

The Media card is going to start off very much like the Links card, with an init function in the card script and a stack script function to extract the media links from the page.
