In my free time I wanted to analyze an old WhatsApp group chat and compare it with its new one to see things like the difference in activity. WhatsApp offers functionality to download group chats as .txt files. I wrote a Python script to read the group chat content into data structures and analyze all messages for the things I was interested in.
The following algorithm was built to fetch the content of every participant's message and split it into datetime, sender and message.
...
    # Add all senders by name to the name set
    for arg in argv:
        name_set.add(arg.lower())
    # Split chat_data back into messages
    splitted_chat_data = raw_chat_data.split("[")
    # Iterate through messages
    for element in splitted_chat_data:
        # Reset variables after every message
        datetime, sender, message = "", "", ""
        # Strip unnecessary characters
        element = element.strip("\r\n")
        # Split message into date and content
        item = element.split("]")
        # Get date
        datetime = item[0]
        # Check if the message was written by a sender who
        # should be analysed
        for name in name_set:
            # Get sender and content (equals last element in item array)
            content = item[-1]
            if name in content:
                content = content.split(name)
                sender = name
                content = content[-1]
                # Skip the ": " separator that follows the sender name
                message = content[2:]
                break
        chat_messages[message_number] = {"datetime": datetime,
                                         "sender": sender, "message": message}
        message_number += 1
    print("\nNumber of messages:\t%d\n" % message_number)
    return chat_messages, name_set
...
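To make the target data structure more concrete, here is a minimal alternative sketch that splits a single exported line with a regular expression. It is not the code from my script: the function name `parse_line`, the pattern and the sample line are made up for illustration, and it assumes every message looks like `[timestamp] Sender: text`, which may differ depending on device and locale.

```python
import re

# Assumed export layout: "[<timestamp>] <sender>: <message>"
LINE_PATTERN = re.compile(
    r"^\[(?P<datetime>[^\]]+)\]\s(?P<sender>[^:]+):\s(?P<message>.*)$",
    re.DOTALL,
)


def parse_line(line):
    """Return a dict with datetime, sender and message, or None if no match."""
    match = LINE_PATTERN.match(line.strip("\r\n"))
    if match is None:
        return None
    return {
        "datetime": match.group("datetime"),
        # Lower-case the sender so it can be compared with lower-cased names
        "sender": match.group("sender").lower(),
        "message": match.group("message"),
    }


# Hypothetical sample line, not taken from a real chat
example = parse_line("[01.02.19, 18:30:05] Alice: See you at 8: bring snacks")
print(example)
```

Because the sender part stops at the first colon, colons inside the message text stay part of the message.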
For example, I was interested in the number of messages each participant wrote and how many words they used in their messages, because some participants are relatively quiet.
Number of words and messages by group member
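A count like this could be sketched as follows. This is a simplified illustration rather than the exact code from the repository: the helper name `count_messages_and_words` and the sample data are hypothetical, and it assumes the `chat_messages` dictionary has the layout built by the parsing step (one entry per message with `datetime`, `sender` and `message` keys).

```python
from collections import Counter


def count_messages_and_words(chat_messages):
    """Count messages and whitespace-separated words per sender."""
    message_counts = Counter()
    word_counts = Counter()
    for entry in chat_messages.values():
        sender = entry["sender"]
        if not sender:
            # Skip entries where no analysed sender was recognised
            continue
        message_counts[sender] += 1
        word_counts[sender] += len(entry["message"].split())
    return message_counts, word_counts


# Hypothetical sample data in the assumed layout
sample = {
    0: {"datetime": "01.02.19, 18:30:05", "sender": "alice", "message": "hello there"},
    1: {"datetime": "01.02.19, 18:31:10", "sender": "bob", "message": "hi"},
    2: {"datetime": "01.02.19, 18:32:00", "sender": "alice", "message": "how are you today"},
    3: {"datetime": "", "sender": "", "message": ""},
}
messages, words = count_messages_and_words(sample)
print(messages, words)
```

Splitting on whitespace is a rough definition of a word; emojis and links each count as one word here.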
I also wanted to know which time of day is best for asking questions, based on the activity of all participants. I often found myself asking a question and then waiting a long time because nobody was active.
Activity by number of messages at specific time of day
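The hourly activity behind such a plot could be computed roughly like this. It is only a sketch under assumptions: the helper `messages_per_hour` is hypothetical, and the timestamp format `%d.%m.%y, %H:%M:%S` is a guess that has to be adapted to whatever the export actually contains.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp layout; adjust to the format of the actual export
TIMESTAMP_FORMAT = "%d.%m.%y, %H:%M:%S"


def messages_per_hour(chat_messages, timestamp_format=TIMESTAMP_FORMAT):
    """Return a list of 24 message counts, one per hour of the day."""
    hours = Counter()
    for entry in chat_messages.values():
        try:
            timestamp = datetime.strptime(entry["datetime"], timestamp_format)
        except ValueError:
            # Ignore entries whose timestamp does not match the assumed format
            continue
        hours[timestamp.hour] += 1
    return [hours[hour] for hour in range(24)]


# Hypothetical sample data in the assumed layout
sample_chat = {
    0: {"datetime": "01.02.19, 18:30:05", "sender": "alice", "message": "hello"},
    1: {"datetime": "01.02.19, 18:45:00", "sender": "bob", "message": "hi"},
    2: {"datetime": "02.02.19, 09:05:12", "sender": "alice", "message": "morning"},
    3: {"datetime": "not a timestamp", "sender": "", "message": ""},
}
activity = messages_per_hour(sample_chat)
print(activity)
```

The resulting list can be passed directly to a bar plot, with the index as the hour of the day.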
I got some very interesting results, as the plots above show. You can find the complete code in my GitHub repository!