[Python] {Selenium} How Do I get Selenium to iterate through elements with the same class name?
<blockquote data-quote="percocet" data-source="post: 402156" data-attributes="member: 71898"><p>I'm trying to write a Python app that extracts the titles of all the videos on a YouTube channel.</p><p></p><p>I'm currently attempting to do it with Selenium.</p><p></p><p>[code]</p>
<p>def getVideoTitles():</p>
<p>    driver = webdriver.Chrome(service=Service("/Users/{username}/PycharmProjects/YoutubeChannelVideos/chromedriver"))</p>
<p>    driver.get(googleYoutubePage())</p><p></p>
<p>    # find_element (singular) returns only the first matching element</p>
<p>    titleElement = driver.find_element(By.CLASS_NAME, "yt-lockup-content")</p>
<p>    print(titleElement.text)  # prints the title plus views, "hours ago", and "CC"</p>
<p>    # I'm not good with Selenium, so let's just store the title and cut everything after it</p>
<p>[/code]</p><p></p>
<p>yt-lockup-content is the class name of each video entry on a YouTube channel's /videos page. With the code above I can get the title of the first video on that page. But I want to iterate through all of the titles (in other words, through every single yt-lockup-content element) in order to store the .text (which is the title of the video).</p><p></p>
<p>So how do I access, say, yt-lockup-content[2]? In other words, the second video on that page, given that every video has the same class name.</p><p></p>
<p>Here is my full code. Play with it if you'd like.</p><p>Cheers,</p><p></p><p>[code]</p>
<p>from selenium import webdriver</p>
<p>from selenium.webdriver.chrome.service import Service</p>
<p>from selenium.webdriver.common.by import By</p><p></p>
<p>CHROMEDRIVER = "/Users/{username}/PycharmProjects/YoutubeChannelVideos/chromedriver"</p><p></p>
<p>def getChannelName():</p>
<p>    print("Please enter the channel that you would like to scrape video titles...")</p>
<p>    channelName = input()</p>
<p>    googleSearch = "https://www.google.ca/search?q=%s+youtube&oq=%s+youtube&aqs=chrome..69i57j0l5.2898j0j4&sourceid=chrome&ie=UTF-8#q=%s+youtube&*" %(channelName, channelName, channelName)</p>
<p>    print(googleSearch)</p>
<p>    return googleSearch</p><p></p>
<p>def googleYoutubePage():</p>
<p>    driver = webdriver.Chrome(service=Service(CHROMEDRIVER))</p>
<p>    driver.get(getChannelName())</p>
<p>    element = driver.find_element(By.CLASS_NAME, "s")  # this is where the link to the proper YouTube page lives</p>
<p>    keys = element.text  # the link to the YouTube page plus extra text that will be cut</p><p></p>
<p>    splitKeys = keys.split(" ")  # besides the link, the text includes the page description, which we need to truncate</p>
<p>    linkToPage = splitKeys[0]  # this is where the link lives</p><p></p>
<p>    # everything from the first newline onward is not part of the link, so cut it</p>
<p>    link = linkToPage.split("\n")[0]</p><p></p>
<p>    videosPage = link + "/videos"</p>
<p>    print(videosPage)</p>
<p>    return videosPage</p><p></p>
<p>def getVideoTitles():</p>
<p>    driver = webdriver.Chrome(service=Service(CHROMEDRIVER))</p>
<p>    driver.get(googleYoutubePage())</p><p></p>
<p>    titleElement = driver.find_element(By.CLASS_NAME, "yt-lockup-content")</p>
<p>    print(titleElement.text)  # prints the title plus views, "hours ago", and "CC"</p>
<p>    # I'm not good with Selenium, so let's just store the title and cut everything after it</p><p></p><p></p>
<p>def main():</p>
<p>    getVideoTitles()</p><p></p>
<p>main()</p>
<p>[/code]</p>
<p>Thanks to everyone who may have tried to answer this question.</p><p></p>
<p>The answer lay mainly in finding the right element that held all of the titles, which was a lot more work than it seems, considering how obfuscated YouTube's web page is.</p><p></p>
<p>What I had to do was loop through every matching element, like so:</p>
<p>[code]</p>
<p># find_elements (plural) returns a list of every element with this class</p>
<p>for title in driver.find_elements(By.CLASS_NAME, "yt-uix-tile-link"):</p>
<p>    print(title.text)</p>
<p>[/code]</p><p></p>
<p>That snippet goes in my getVideoTitles function, replacing the titleElement assignment.</p><p></p>
<p>Cheers,</p>
<p>Thread can be closed by mods, since the answer has been found.</p></blockquote><p></p>
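For anyone landing here later, here is a minimal sketch of iterating over elements that share a class name, assuming Selenium 4's `find_elements(by, value)` call. The helper names `collect_titles` and `print_channel_titles` are made up for illustration, and the class name is the one from the post, which may well no longer match YouTube's markup.

```python
def collect_titles(driver, by, value):
    """Return the stripped, non-empty .text of every element matching the locator."""
    # find_elements (plural) returns a list of all matches, or an empty list if none.
    elements = driver.find_elements(by, value)
    titles = []
    for element in elements:
        text = element.text.strip()
        if text:
            titles.append(text)
    return titles


def print_channel_titles(videos_url, class_name="yt-uix-tile-link"):
    """Hypothetical driver code: open the page and print every title."""
    # Imported lazily so collect_titles can be tried without a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumes a chromedriver that Selenium can locate
    try:
        driver.get(videos_url)
        titles = collect_titles(driver, By.CLASS_NAME, class_name)
        for title in titles:
            print(title)
        # The list is zero-indexed, so the second video is titles[1].
        if len(titles) > 1:
            print("Second video:", titles[1])
    finally:
        driver.quit()
```

Keeping the element lookup in its own function means the collected list can be indexed, sliced, or stored instead of only printed, which is roughly what the original question was after.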
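The link cleanup in googleYoutubePage can also be done with plain string splitting rather than scanning for a newline by hand. A small sketch, assuming the element text starts with the link followed by a space or newline, as the post describes; `extract_link` and `videos_page` are made-up helper names, and the URL in the comment is only a placeholder.

```python
def extract_link(element_text):
    """Return the leading link from text such as 'https://example.com/user/demo\\nDescription'."""
    # str.split() with no arguments splits on any run of whitespace
    # (spaces and newlines alike), so the first token is the link.
    tokens = element_text.split()
    if not tokens:
        raise ValueError("element text is empty, no link to extract")
    return tokens[0]


def videos_page(element_text):
    """Build the channel's /videos URL from the raw element text."""
    # rstrip("/") avoids a double slash if the link already ends with one.
    return extract_link(element_text).rstrip("/") + "/videos"
```

Raising on empty text makes a changed page layout fail loudly instead of producing a bogus "/videos" URL.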