Pascal Schlaak | Project 4

Who is the most active group member?

In my free time I wanted to analyze an old WhatsApp group chat and compare it with its successor to see things like differences in activity. WhatsApp offers functionality to export group chats as .txt files. I wrote a Python script that reads the group chat content into data structures and analyzes all messages for the things I was interested in. The following algorithm was built to fetch the content of every participant's message and split it into datetime, sender and message.

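    # NOTE: the function header and the body of the name loop are not part of
    # the original excerpt; the versions below are assumptions.
    def read_messages(raw_chat_data, argv):
        chat_messages = {}
        message_number = 0
        # Add all senders by name to the name set
        name_set = set()
        for arg in argv:
            name_set.add(arg)

        # Split chat_data back into messages (each starts with "[date, time]")
        splitted_chat_data = raw_chat_data.split("[")
        # Iterate through messages
        for element in splitted_chat_data:
            # Strip unnecessary characters
            element = element.strip("\r\n")
            # Skip empty fragments such as the text before the first "["
            if not element:
                continue
            # Reset variables after every message
            datetime, sender, message = "", "", ""
            # Split message into date and content at the first "]" only,
            # so a "]" inside the message text does not cut it short
            item = element.split("]", 1)
            # Get date
            datetime = item[0]
            # Get sender and content (equals last element in item array)
            content = item[-1].lstrip()
            # Check if message was written by one of the senders that
            # should be analysed; the name must lead the content so that
            # a name merely mentioned in the text is not taken as sender
            for name in name_set:
                if content.startswith(name + ":"):
                    sender = name
                    # Drop the name and the ": " separator that follows it
                    message = content[len(name) + 2:]
                    break
            chat_messages[message_number] = {"datetime": datetime,
                                             "sender": sender,
                                             "message": message}
            message_number += 1
        print("\nNumber of messages:\t%d\n" % message_number)
        return chat_messages, name_set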

For example, I was interested in the number of messages every participant wrote and how many words they used in their messages, because some participants are relatively quiet.

Number of words and messages by group member
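
The numbers behind a plot like this could be computed roughly as follows. This is only a minimal sketch, not the code from the repository: it assumes the chat_messages dictionary built by the parser above, the function name count_messages_and_words is made up for illustration, and the sample senders and messages are invented.

```python
from collections import Counter


def count_messages_and_words(chat_messages):
    """Return (messages per sender, words per sender) as two Counters.

    Hypothetical helper: assumes each entry has the keys "sender" and
    "message", as in the parser's chat_messages dictionary.
    """
    message_counts = Counter()
    word_counts = Counter()
    for entry in chat_messages.values():
        sender = entry["sender"]
        # Entries without a recognised sender are not attributed to anyone
        if not sender:
            continue
        message_counts[sender] += 1
        # Words are approximated as whitespace-separated tokens
        word_counts[sender] += len(entry["message"].split())
    return message_counts, word_counts


# Invented sample data; the timestamp format is an assumption
example = {
    0: {"datetime": "27.06.19, 18:02:11", "sender": "Alice",
        "message": "Anyone up for lunch tomorrow?"},
    1: {"datetime": "27.06.19, 18:05:40", "sender": "Bob",
        "message": "Sure"},
    2: {"datetime": "27.06.19, 18:06:02", "sender": "Alice",
        "message": "Great, see you"},
}
messages, words = count_messages_and_words(example)
print(messages.most_common())
print(words.most_common())
```

Counting whitespace-separated tokens treats emoji and punctuation attached to a word as part of that word, which seems good enough for comparing how talkative members are.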

I also wanted to know which time of day is best for asking questions, based on the activity of all participants. I often found myself in situations where I asked a question and then waited a long time because nobody was active.

Activity by number of messages at specific daytime
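
Activity per hour could be derived from the stored datetime strings along these lines. Again, this is a sketch under assumptions: the timestamp format depends on the phone's locale, so the "%d.%m.%y, %H:%M:%S" pattern used here is a guess, and messages_per_hour is a hypothetical name that does not come from the original script.

```python
from collections import Counter
from datetime import datetime

# Assumed export format, e.g. "27.06.19, 18:02:11"; adjust to the real file
ASSUMED_FORMAT = "%d.%m.%y, %H:%M:%S"


def messages_per_hour(chat_messages, timestamp_format=ASSUMED_FORMAT):
    """Return a list of 24 message counts, one per hour of the day."""
    hours = Counter()
    for entry in chat_messages.values():
        try:
            timestamp = datetime.strptime(entry["datetime"], timestamp_format)
        except ValueError:
            # Skip entries whose timestamp does not match the assumed format
            continue
        hours[timestamp.hour] += 1
    return [hours[hour] for hour in range(24)]


# Invented sample data
example = {
    0: {"datetime": "27.06.19, 18:02:11", "sender": "Alice", "message": "Hi"},
    1: {"datetime": "27.06.19, 18:05:40", "sender": "Bob", "message": "Hello"},
    2: {"datetime": "28.06.19, 09:15:00", "sender": "Alice", "message": "Morning"},
    3: {"datetime": "", "sender": "", "message": ""},
}
counts = messages_per_hour(example)
print(counts)
```

The resulting list has one value per hour, so it can be handed directly to a bar chart, for example Matplotlib's plt.bar(range(24), counts).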

I got some very interesting results, like the ones in the plots above. You can find the complete code in my GitHub repository!




Python, Matplotlib




June 27, 2019