Investigating America’s Most Reputable Companies

STA/ISS 313 - Project 1


Team Six


Our project investigates the factors that contribute to customer perceptions of notable companies’ reputations. We have interacted with many of these companies before, yet our familiarity with them is limited to name recognition. To gain a more comprehensive understanding of where they stand, we examine data from the 2022 Axios-Harris Poll, which contains company scores for specific attribute categories in 2022, as well as how these scores and overall rankings have changed over time. By investigating differences in scores between industries, the relationships between attribute categories, and changes in company scores over time, we hope to gain better insight into what may influence company reputations.


The dataset that we are using comes from the TidyTuesday project, which draws on the Axios-Harris Poll, a study of the reputations of the most visible brands in America. The Harris Poll conducted a survey in February 2022 among a representative sample of the American population to identify the companies that were most prominent in the public’s mind. The 100 companies with the highest number of nominations were included in the “Most Visible” list. The poll resulted in two datasets: `polls` and `reputation`.

The `polls` dataset has 8 variables and 500 observations. It contains each company’s industry, overall ranking, and RQ score in 2022, as well as information about each company’s rating from 2017 to 2021. The RQ score is a metric that combines a company’s ratings on the individual attributes. The attributes in the data are trust, ethics, growth, P&S (product and service), citizenship, vision, and culture. The score is calculated using the formula: [ (Sum of ratings of each of the 9 attributes)/(the total number of attributes answered x 7) ] x 100. Note that the formula refers to 9 attributes, while the dataset reports scores for the 7 listed above. The score ranges are: 80 & above: Excellent | 75-79: Very Good | 70-74: Good | 65-69: Fair | 55-64: Poor | 50-54: Very Poor | Below 50: Critical.

The `reputation` dataset has 10 variables and 700 observations. It covers only the year 2022 and shows the individual breakdown of how companies were scored and ranked on the 7 attributes mentioned previously [1]. This dataset is important for illustrating the specific attribute categories that different companies and their industries scored well or poorly on.
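The RQ formula and score ranges described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the poll's actual scoring code: the function names `rq_score` and `rq_band` are hypothetical, the example ratings are made up, and the 1-to-7 rating scale is inferred from the "x 7" term in the formula. Because the published ranges are stated for whole numbers, the sketch also assumes that a fractional score such as 79.5 falls into the lower band.

```python
# Score bands from the text; each lower bound is treated as inclusive.
RQ_BANDS = [
    (80, "Excellent"),
    (75, "Very Good"),
    (70, "Good"),
    (65, "Fair"),
    (55, "Poor"),
    (50, "Very Poor"),
]


def rq_score(ratings):
    """Return sum(ratings) / (number answered * 7) * 100.

    `ratings` holds one value per attribute; None marks an unanswered
    attribute (assumption: ratings are on a 1-7 scale).
    """
    answered = [r for r in ratings if r is not None]
    if not answered:
        raise ValueError("at least one attribute rating is required")
    return sum(answered) / (len(answered) * 7) * 100


def rq_band(score):
    """Map an RQ score to its descriptive band."""
    for lower_bound, label in RQ_BANDS:
        if score >= lower_bound:
            return label
    return "Critical"


# Hypothetical respondent who skipped one attribute.
example_ratings = [6, 5, 6, None, 7, 5, 6]
score = rq_score(example_ratings)
print(round(score, 1), rq_band(score))  # prints: 83.3 Excellent
```

Dividing by the number of attributes answered, rather than by a fixed count, keeps a skipped attribute from lowering the score.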

Question 1

How do customer ratings of company attributes vary by industry and how do these attributes relate to one another?


Our first question explores the attribute scores that customers gave these notable companies in 2022 and how these scores vary by industry. We are interested in the differences behind these attribute scores because industries usually have diverse policies and mission statements that lead them to emphasize different aspects of a company. Consequently, we examine which industries scored the highest for each of the 7 attribute categories, how the distributions of these scores differ, and whether the same industries lead in more than one attribute. Additionally, we go one layer deeper and examine the relationships between the attributes themselves. We want to see how customer perceptions of these attributes relate to one another and whether high or low scores in one category are associated with similar scores in other categories. To answer these questions, we use the `reputation` dataset. Specifically, we look at the `name` variable, which identifies the company attribute; `score`, which is the score given by customers for each attribute; and `industry`, which identifies the industry that each company falls under.


We chose comparative boxplots for our first visualization of question 1 to see which attributes consumers valued in companies across different industries in 2022. Each boxplot represents the industry that scored highest, on average, for that attribute. A side-by-side comparison of these boxplots allows us to analyze the similarities and differences in the center and spread of each attribute. This helps us answer which attributes were most important to consumers and how industry affected those ratings. For our second visualization, we constructed a correlation matrix of all seven attributes in 2022. While our first visualization gave us a broader understanding of the attributes consumers valued, we wanted to dive deeper into whether consumers were likely to give similar ratings across different attributes. We chose a correlation matrix because it shows the correlations between all seven attributes in one visualization. In addition, p-values can be added to the visualization to show whether any two attributes have a significant relationship. This visualization shows whether the rating for one attribute was likely to predict the rating for a different attribute.
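The selection step behind the boxplots, finding the industry with the highest average score for each attribute, can be sketched in plain Python. This is a minimal illustration rather than the code used for the project: the rows are invented values, not poll data, and the function name `top_industry_by_attribute` is hypothetical. Each row mirrors the `industry`, `name`, and `score` variables of the `reputation` dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (industry, name, score) rows; not real poll values.
rows = [
    ("Groceries", "Trust", 80.0),
    ("Groceries", "Trust", 78.0),
    ("Logistics", "Trust", 77.0),
    ("Groceries", "Vision", 76.0),
    ("Logistics", "Vision", 79.0),
    ("Logistics", "Vision", 81.0),
]


def top_industry_by_attribute(rows):
    """Return {attribute: (industry, mean score)} for the best industry."""
    # Collect every company score under its (attribute, industry) pair.
    grouped = defaultdict(list)
    for industry, attribute, score in rows:
        grouped[(attribute, industry)].append(score)

    # Keep the industry with the highest mean score for each attribute.
    best = {}
    for (attribute, industry), scores in grouped.items():
        average = mean(scores)
        if attribute not in best or average > best[attribute][1]:
            best[attribute] = (industry, average)
    return best


for attribute, (industry, average) in sorted(top_industry_by_attribute(rows).items()):
    print(attribute, industry, round(average, 1))
# prints:
# Trust Groceries 79.0
# Vision Logistics 80.0
```

The company-level scores of each winning industry are what a boxplot for that attribute would then display.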


Visualization #1

Visualization #2


The first visualization that we created to answer this question was a multiple boxplot graph of the highest scoring industries for each of the 7 attribute categories. The visualization reveals overlap for many of the attributes, with the most prevalent industry being “Groceries,” which was the highest scoring industry in the citizenship, ethics, growth, and trust categories. The second most prevalent industry was “Logistics,” which scored the highest in the culture and vision categories. Lastly, the “Industrial” industry scored the highest in the P&S (product and service) category. The relatively straightforward nature of grocery stores most likely accounts for “Groceries” being the highest scoring industry. It is typically hard to find fault with the way grocery stores conduct their business, which could explain why they scored so high in many of the categories. Grocery stores are also reliable and trustworthy in fulfilling their mission of supplying individuals with food, which may give individuals more positive feelings about this industry than about industries such as technology or financial services that might generate more polarizing opinions. The boxplots also illustrate the spread of the attribute scores for each corresponding industry. The “Groceries” ethics scores had the greatest variability. Since ethics is usually very subjective and depends on the distinctive experiences of individuals, this could explain the wider range of scores. Comparatively, attributes such as “Trust” and “Vision” had smaller spreads, meaning the scores were much more similar. This could be because of the more objective nature of what trust and company vision mean for customers. Lastly, it is important to note that the attribute scores were highest, on average, for the industrial industry in the “P&S” category and lowest for the grocery industry in the “Citizenship” category. This is probably because product and service is easier to judge impartially: a company that produces useful and functional products delivers what it markets and therefore earns higher scores from customers. However, social responsibility (company citizenship) is probably not something that companies focus on heavily, and even if they do, individuals may not be very aware of it.

The correlation matrix shows that every pair of company attributes in 2022 was strongly correlated, though not all of the relationships were statistically significant. All pairs of attributes have correlations of r > 0.80, so they are all strongly or very strongly positively correlated. Simply put, a high rating for any one attribute tended to go along with high ratings for all other attributes. There are two reasons why this trend makes sense. First, a consumer’s general feeling about a company probably causes them to rate its attributes similarly. Second, some of the attributes overlap, such as “Citizenship,” “Trust,” and “Ethics,” which all relate to whether consumers see a company as socially responsible. This point also potentially explains why some correlations are stronger than others: “Trust” and “Growth” are less closely related ideas than “Trust” and “Citizenship.” However, despite the prevalence of strong correlations, not all attributes significantly predict other attributes. Only 10 of the 21 attribute pairs are significantly related. Thus, considering the p-values of the attribute relationships shows that some of the strong correlations are not as robust as others. Furthermore, the main trend among the significance results is that “Culture” is the only attribute that does not significantly predict any other attribute. This suggests that a consumer’s evaluation of a company’s culture does not have a predictive effect on how the consumer would rate that company on a different attribute. Since company culture relates more to the company’s workplace and less to how it interacts with consumers, it is understandable that consumers may separate this attribute from the others.
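The pairwise structure behind the correlation matrix can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the project's code: the function names `pearson_r` and `correlation_pairs` are hypothetical, and the scores are invented, with one value per company. The sketch computes Pearson's r for every pair of attributes and confirms that seven attributes give 21 pairs. It does not compute the p-values discussed above, which would need a significance test from a statistics library (for example, `scipy.stats.pearsonr` returns both the correlation and a p-value).

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_pairs(columns):
    """Return {(attr_a, attr_b): r} for every unordered attribute pair."""
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Hypothetical attribute scores, one entry per company (not poll data).
columns = {
    "Trust": [80.0, 75.0, 70.0, 65.0, 60.0],
    "Ethics": [79.0, 76.0, 69.0, 66.0, 58.0],
    "Culture": [77.0, 70.0, 72.0, 68.0, 61.0],
}

for (a, b), r in correlation_pairs(columns).items():
    print(f"{a} vs {b}: r = {r:.2f}")

# Seven attributes give 7 * 6 / 2 = 21 unordered pairs.
print(len(list(combinations(range(7), 2))))  # prints: 21
```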

Question 2


  1. Axios. (2022, May 24). The 2022 Axios Harris Poll 100 reputation rankings. Axios. Retrieved February 22, 2023.