[Python] {Selenium} How Do I get Selenium to iterate through elements with the same class name?

percocet

Oct 21, 2016
I'm trying to make a Python app that extracts the titles of all the videos on a YouTube channel.

I'm currently attempting to do it using Selenium.

Code:
def getVideoTitles():
    driver = webdriver.Chrome("/Users/{username}/PycharmProjects/YoutubeChannelVideos/chromedriver")
    driver.get(googleYoutubePage())

    titleElement = driver.find_element_by_class_name("yt-lockup-content")
    print(titleElement.text)  # prints the title, plus views, hours ago, and "CC"
    # I suck at Selenium, so let's just store the title and cut everything after it

yt-lockup-content is the class name of each video entry on a YouTube channel's /videos page. With the code above I can get the title of the first video on that page. But I want to iterate through every yt-lockup-content element so I can store its .text, which contains the title of the video.

So how do I access, say, the second yt-lockup-content element, i.e. the second video on that page? Every video on the page has the same class name.
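
For what it's worth, here is a minimal sketch of one way to get at every element with the same class. Selenium's plural find_elements call returns a list of all matches rather than only the first one. The sketch assumes the title is the first line of each element's text (the comment in the code above suggests the title comes first, but the line layout is an assumption). The helper name titles_by_class and the FakeElement/FakeDriver stand-ins are hypothetical and exist only so the sketch runs without a browser. The locator strategy is passed as the plain string "class name", which is the value of Selenium's By.CLASS_NAME constant.

```python
def titles_by_class(driver, class_name="yt-lockup-content"):
    """Return the first text line of every element with the given class."""
    # "class name" is the string value of Selenium's By.CLASS_NAME constant.
    elements = driver.find_elements("class name", class_name)
    titles = []
    for element in elements:
        lines = element.text.splitlines()
        if lines:  # skip elements with no visible text
            titles.append(lines[0])
    return titles


# Hypothetical stand-ins so the sketch runs without a browser.
class FakeElement:
    def __init__(self, text):
        self.text = text


class FakeDriver:
    def __init__(self, elements):
        self._elements = elements

    def find_elements(self, by, value):
        return list(self._elements)


if __name__ == "__main__":
    driver = FakeDriver([
        FakeElement("First video\n1,234 views\n2 hours ago"),
        FakeElement("Second video\n567 views\n1 day ago\nCC"),
    ])
    print(titles_by_class(driver))  # ['First video', 'Second video']
```

With a real driver, titles_by_class(driver) would be called after driver.get(...). The returned list is zero-based, so the second video is at index 1.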

Here is my full code. Play with it if you'd like.
Cheers,

Code:
from selenium import webdriver

def getChannelName():
    print("Please enter the channel that you would like to scrape video titles...")
    channelName = input()
    googleSearch = "https://www.google.ca/search?q=%s+youtube&oq=%s+youtube&aqs=chrome..69i57j0l5.2898j0j4&sourceid=chrome&ie=UTF-8#q=%s+youtube&*" %(channelName, channelName, channelName)
    print(googleSearch)
    return googleSearch

def googleYoutubePage():
    driver = webdriver.Chrome("/Users/{username}/PycharmProjects/YoutubeChannelVideos/chromedriver")
    driver.get(getChannelName())
    element = driver.find_element_by_class_name("s")  # this is where the link to the proper YouTube page lives
    keys = element.text  # this grabs the link to the YouTube page plus other text that will be cut

    splitKeys = keys.split(" ")  # split on spaces, because aside from the link this grabs the page description, which we need to truncate
    linkToPage = splitKeys[0]  # this is where the link lives

    # Find where the text beside the link begins (it is unnecessary).
    # Default to the full length so the slice below still works if there is no newline.
    extraCrapStartsHere = len(linkToPage)
    for index, char in enumerate(linkToPage):
        if char == "\n":
            extraCrapStartsHere = index  # it starts here, everything beyond here can be cut

    # the official link is everything in linkToPage up to the cut point
    link = linkToPage[:extraCrapStartsHere]

    videosPage = link + "/videos"
    print(videosPage)
    return videosPage

def getVideoTitles():
    driver = webdriver.Chrome("/Users/{username}/PycharmProjects/YoutubeChannelVideos/chromedriver")
    driver.get(googleYoutubePage())

    titleElement = driver.find_element_by_class_name("yt-lockup-content")
    print(titleElement.text)  # prints the title, plus views, hours ago, and "CC"
    # I suck at Selenium, so let's just store the title and cut everything after it


def main():
    getVideoTitles()

main()
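
The link-trimming step in googleYoutubePage (split on a space, then cut at a newline) can probably be done more compactly. This is only a sketch and not a drop-in copy: it cuts at the first whitespace of any kind, and it returns an empty string when the text is empty. The function names first_token and videos_url, and the example URL, are hypothetical.

```python
def first_token(text):
    """Return the text up to the first whitespace (space, tab, or newline)."""
    tokens = text.split()  # split() with no argument splits on any run of whitespace
    return tokens[0] if tokens else ""


def videos_url(result_text):
    """Build the /videos URL from the text of a search result snippet."""
    link = first_token(result_text)
    return link.rstrip("/") + "/videos"


if __name__ == "__main__":
    sample = "https://www.youtube.com/user/example\nSome channel description"
    print(videos_url(sample))  # https://www.youtube.com/user/example/videos
```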
 
Thanks to everyone who may have tried to answer this question.

The answer lay mainly in finding the right element that held all of the titles, which was a lot more work than it seems, considering how obfuscated YouTube's web page is.

What I had to do was loop through every element, like so:
Code:
# find_elements_by_class_name (plural) returns a list of every matching element
for title in driver.find_elements_by_class_name("yt-uix-tile-link"):
    print(title.text)

That code goes in my getVideoTitles function, in place of the titleElement assignment.
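
One related note, in case a loop needs to depend on whether some element (a button, for example) is on the page: the singular find_element_* call either returns an element or raises NoSuchElementException. It never returns False, so a check like "is not False" is always true. The plural form returns an empty list when nothing matches, which makes for a simple existence test. Below is a minimal sketch that also stores the titles in a list instead of printing them. The names has_element and collect_titles and the Fake* stand-ins are hypothetical, and "class name" is the string value of Selenium's By.CLASS_NAME constant.

```python
def has_element(driver, class_name):
    """Return True if at least one element with this class is on the page."""
    # find_elements returns an empty list (no exception) when nothing matches.
    return len(driver.find_elements("class name", class_name)) > 0


def collect_titles(driver, link_class="yt-uix-tile-link"):
    """Return the text of every matching link as a list instead of printing it."""
    return [link.text for link in driver.find_elements("class name", link_class)]


# Hypothetical stand-ins so the sketch runs without a browser.
class FakeElement:
    def __init__(self, text):
        self.text = text


class FakeDriver:
    def __init__(self, elements_by_class):
        self._elements_by_class = elements_by_class

    def find_elements(self, by, value):
        return list(self._elements_by_class.get(value, []))


if __name__ == "__main__":
    driver = FakeDriver({
        "yt-uix-tile-link": [FakeElement("Video A"), FakeElement("Video B")],
    })
    print(has_element(driver, "yt-uix-button"))  # False
    print(collect_titles(driver))  # ['Video A', 'Video B']
```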

Cheers,
 
Thread can be closed by mods due to answer being found
 
