[PY] Data Mining Script
<blockquote data-quote="Brackson" data-source="post: 242816" data-attributes="member: 34747"><p>Current version: v1.1</p>
<p>[code]</p>
<p>#!/usr/bin/env python</p>
<p></p>
<p>import urllib2, os, time</p>
<p></p>
<p>def store_into_file():</p>
<p>    url = 'http://google.com/' # URL that you want to mine.</p>
<p>    data = urllib2.urlopen(url).read() # Get the HTML source of the URL.</p>
<p></p>
<p>    current_time = time.strftime('%H:%M:%S', time.localtime()) # Get the current time so we can use it for the txt filename.</p>
<p></p>
<p>    r = open('%s.txt' % (current_time), 'w') # Create the file.</p>
<p>    r.write(data) # Put the source in the file.</p>
<p>    r.close() # Close the file.</p>
<p></p>
<p>def main():</p>
<p>    store_into_file()</p>
<p></p>
<p>while True:</p>
<p>    main()</p>
<p>[/code]</p>
<p>(<a href="http://pastebin.com/vgXh761a" target="_blank">with syntax formatting</a>)</p>
<p></p>
<p><u>PREVIOUS VERSIONS</u></p>
<ul> <li data-xf-list-type="ul"><a href="http://pastebin.com/Bi9fhYQU" target="_blank">v1.0</a></li> </ul>
<p></p>
<p>This script fetches the HTML source of a webpage and stores it in a text file named after the current time. It runs in an endless loop, so stop it manually (Ctrl+C) when you have what you need. You do not need to install any modules for this (AFAIK); urllib2 ships with Python 2. If you want to parse the content before it's stored in the file, you can use <a href="http://www.crummy.com/software/BeautifulSoup/" target="_blank">BeautifulSoup</a>.</p>
<p></p>
<p>This would come in handy if you were building an archive site, or if you just want to keep a log of a website's content over time.</p>
<p></p>
<p>I made this for educational purposes, and I thought I'd release it because I don't really need it. Thanks for viewing!</p></blockquote>
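For anyone on Python 3, where urllib2 no longer exists, here is a rough sketch of the same idea using urllib.request. This is not the author's code: the function names (snapshot_name, save_snapshot, fetch, run), the 60-second pause, the 10-second timeout and the date-based filename are all assumptions of mine. The filename avoids colons because Windows does not allow them in filenames, and the pause keeps the loop from requesting the page continuously.

```python
import time
import urllib.request
from pathlib import Path


def snapshot_name(when=None):
    """Build a colon-free filename from a time.struct_time (defaults to now)."""
    if when is None:
        when = time.localtime()
    return time.strftime('%Y-%m-%d_%H-%M-%S', when) + '.txt'


def save_snapshot(data, directory='.', when=None):
    """Write the raw page bytes to a timestamped file and return its path."""
    path = Path(directory) / snapshot_name(when)
    path.write_bytes(data)
    return path


def fetch(url, timeout=10):
    """Return the response body of url as bytes (timeout value is arbitrary)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def run(url, interval=60, directory='.'):
    """Fetch and store url forever, sleeping interval seconds between requests."""
    while True:
        save_snapshot(fetch(url), directory)
        time.sleep(interval)
```

Calling run('http://google.com/') would start the loop; like the original, it only stops when you interrupt it with Ctrl+C.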
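On the point about parsing the content before storing it: the post suggests BeautifulSoup, which is a separate install. As a dependency-free stand-in, the sketch below uses html.parser from the Python 3 standard library to pull out just the link targets, which could then be written to the file instead of the full source. The names LinkCollector, extract_links and links_report are made up for illustration, and this is only one possible thing to extract.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the link targets found in an HTML string, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


def links_report(html_text):
    """Join the extracted links one per line, ready to write to a text file."""
    return '\n'.join(extract_links(html_text))
```

The input here is a str, so bytes returned by urlopen(...).read() would need to be decoded first (for example with .decode('utf-8', errors='replace'), assuming the page is UTF-8).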