{"id":8265,"date":"2025-05-19T06:06:02","date_gmt":"2025-05-19T06:06:02","guid":{"rendered":"https:\/\/ubiq.co\/tech-blog\/?p=8265"},"modified":"2025-05-20T07:10:26","modified_gmt":"2025-05-20T07:10:26","slug":"how-to-find-duplicates-in-python-pandas-dataframe","status":"publish","type":"post","link":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/","title":{"rendered":"How to Find Duplicates in Python Pandas Dataframe"},"content":{"rendered":"\n<p>Python Pandas is commonly used to store and analyze data. Duplicate rows and duplicate column values are among the most common problems faced during <a href=\"https:\/\/ubiq.co\/tools\/data-analysis\" target=\"_blank\" rel=\"noreferrer noopener\">data analysis<\/a>, so Python developers often need to find and remove duplicates from their data. You can find them using the duplicated() function, available on every Pandas dataframe. In this article, we will learn how to find duplicates in a Python Pandas dataframe.<\/p>\n\n\n\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" 
height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#Why_Find_Duplicates_in_Python_Pandas_Dataframe\" >Why Find Duplicates in Python Pandas Dataframe<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#How_to_Find_Duplicates_in_Python_Pandas_Dataframe\" >How to Find Duplicates in Python Pandas Dataframe<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#1_Find_duplicates_based_on_all_columns\" >1. Find duplicates based on all columns<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#2_Find_duplicates_based_on_single_column\" >2. Find duplicates based on single column<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#3_Find_duplicates_based_on_multiple_columns\" >3. 
Find duplicates based on multiple columns<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#4_Get_duplicate_last_rows\" >4. Get duplicate last rows<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#5_Get_duplicates_in_sorted_order\" >5. Get duplicates in sorted order<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#Conclusion\" >Conclusion<\/a><\/li><\/ul><\/nav><\/div>\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Why_Find_Duplicates_in_Python_Pandas_Dataframe\"><\/span>Why Find Duplicates in Python Pandas Dataframe<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Data received from another source may already contain fully duplicated rows. Sometimes, only certain columns contain duplicate values. Data entry errors can also introduce duplicates. If you <a href=\"https:\/\/ubiq.co\/tech-blog\/how-to-merge-and-join-pandas-dataframes\/\">merge or join dataframes<\/a> or datasets, the result may contain duplicates too. Python developers therefore need to identify these duplicate values so that they can modify or remove them from the dataframe. 
This is an essential aspect of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Data_preparation\" target=\"_blank\" rel=\"noreferrer noopener\">data preparation<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_to_Find_Duplicates_in_Python_Pandas_Dataframe\"><\/span>How to Find Duplicates in Python Pandas Dataframe<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>There are several simple ways to find duplicates in Python Pandas. Every Pandas dataframe supports the duplicated() function, which lets you identify duplicates for different use cases. Here is its syntax.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">DataFrame.duplicated(subset=None, keep='first')<\/pre>\n\n\n\n<p>The duplicated() function does not remove anything. It returns a Boolean Series with one value per row, where True marks the row as a duplicate. It accepts two arguments, subset and keep, and both are optional. By default, duplicated() compares rows across all columns. If you want to compare only certain columns, specify them in the subset argument. The keep argument determines which occurrence in each group of duplicates is left unflagged: 'first' (the default) leaves the first occurrence unflagged, 'last' leaves the last occurrence unflagged, and False flags every occurrence.<\/p>\n\n\n\n<p>Let us say you have the following Pandas dataframe.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import pandas as pd<br><br>data = {'Name': ['John', 'Jane', 'John', 'Joe'],<br>        'Age': [28, 22, 28, 22],<br>        'City': ['New York', 'Paris', 'New York', 'London']}<br><br>df = pd.DataFrame(data)<br>print(df)<br><br>## output <br><br>   Name  Age      City<br>0  John   28  New York<br>1  Jane   22     Paris<br>2  John   28  New York<br>3   Joe   22    London<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"1_Find_duplicates_based_on_all_columns\"><\/span>1. 
Find duplicates based on all columns<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>By default, calling the duplicated() function on a dataframe flags every row that repeats an earlier row across all columns. Here is an example.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">duplicate = df.duplicated()<br>print(duplicate)<br><br>## output<br><br>0    False<br>1    False<br>2     True<br>3    False<br>dtype: bool<\/pre>\n\n\n\n<p>In the above output, the duplicated() function returns a Boolean (True\/False) value for each row. True means the row is a duplicate, and False means it is not. If you want the actual duplicate rows, use the result of duplicated() as a Boolean mask to index the dataframe, as shown.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">duplicate = df[df.duplicated()]<br>print(duplicate)<br><br>## output<br><br>   Name  Age      City<br>2  John   28  New York<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"2_Find_duplicates_based_on_single_column\"><\/span>2. Find duplicates based on single column<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Sometimes, only a single column of a dataframe contains duplicate values. Or you may want to identify only those rows that contain duplicate values for a specific column. For this purpose, pass the column name as the <em>subset<\/em> argument. Here is an example that finds rows with duplicate values in the &#8216;Age&#8217; column.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">duplicate = df[df.duplicated(subset='Age')]<br># or, equivalently, pass the column name positionally<br>duplicate = df[df.duplicated('Age')]<br><br>print(duplicate)<br><br>## output<br><br>   Name  Age      City<br>2  John   28  New York<br>3   Joe   22    London<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"3_Find_duplicates_based_on_multiple_columns\"><\/span>3. 
Find duplicates based on multiple columns<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Similarly, you may sometimes need to find duplicates based on multiple columns. In this case, pass the list of column names as the subset argument.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">duplicate = df[df.duplicated(subset=['Name','Age'])]<br># or, equivalently, pass the list positionally<br>duplicate = df[df.duplicated(['Name','Age'])]<br><br>print(duplicate)<br><br>## output<br><br>   Name  Age      City<br>2  John   28  New York<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"4_Get_duplicate_last_rows\"><\/span>4. Get duplicate last rows<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>In each of the above use cases, the duplicated() function leaves the first occurrence unflagged and flags every later occurrence as a duplicate. What if you want to treat the last occurrence as the original instead? In this case, you can use the keep argument. Here is an example where we find duplicates for the &#8216;Age&#8217; column but leave the last row in each group of duplicates unflagged, so the earlier rows are returned.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">duplicate = df[df.duplicated(subset='Age', keep='last')]<br>print(duplicate)<br><br>## output<br><br>   Name  Age      City<br>0  John   28  New York<br>1  Jane   22     Paris<\/pre>\n\n\n\n<p>Here is an example where we leave the first row of each group unflagged instead. This is the default behavior.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">duplicate = df[df.duplicated(subset='Age', keep='first')]<br>print(duplicate)<br><br>## output<br><br>   Name  Age      City<br>2  John   28  New York<br>3   Joe   22    London<br><\/pre>\n\n\n\n<p>As you can see, the two outputs differ, even though both find duplicates based on the same &#8216;Age&#8217; column.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"5_Get_duplicates_in_sorted_order\"><\/span>5. 
Get duplicates in sorted order<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Sometimes, duplicate rows or column values are scattered across your dataframe instead of sitting next to each other. In such cases, you can sort the duplicates so that the output is easier to read. You can sort the dataframe before or after finding duplicates, and we will look at both approaches. Both examples pass keep=False so that every occurrence of a duplicate is returned, not just the repeats.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">duplicate = df[df.duplicated(['Name', 'Age'], keep=False)].sort_values('Age')<br>print(duplicate)<br><br>## output<br>   Name  Age      City<br>0  John   28  New York<br>2  John   28  New York<\/pre>\n\n\n\n<p>In the above example, we sort the rows selected with duplicated(). Here is an example where we sort the dataframe before finding its duplicates.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">sorted_df = df.sort_values(by=['Age'])<br>duplicate = sorted_df[sorted_df.duplicated(['Name', 'Age'], keep=False)]<br>print(duplicate)<br><br>## output<br><br>   Name  Age      City<br>0  John   28  New York<br>2  John   28  New York<\/pre>\n\n\n\n<p>After you have removed duplicates from your dataframe, you can also <a href=\"https:\/\/ubiq.co\/tech-blog\/how-to-write-pandas-dataframe-to-excel-spreadsheets\/\">export it to an Excel spreadsheet<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Conclusion\"><\/span>Conclusion<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>In this article, we have learnt several simple ways to find and extract the duplicate rows in a Pandas dataframe. You can use the duplicated() function for this purpose. It can find completely duplicate rows, or rows with duplicate values in one or more columns. We have also learnt how to choose whether the first or the last row in a set of duplicate rows is left unflagged. Lastly, we learnt how to sort the duplicate rows to make them easier to analyze. 
You can use any of these methods as per your requirement. Finding and extracting duplicate rows, or rows with duplicate column values, is a common step in data preparation and cleansing.<\/p>\n\n\n\n<p>Also read:<\/p>\n\n\n\n<p><a href=\"https:\/\/ubiq.co\/tech-blog\/how-to-merge-and-join-pandas-dataframes\/\">How to Merge and Join Pandas Dataframe<\/a><br><a href=\"https:\/\/ubiq.co\/tech-blog\/how-to-create-pivot-tables-in-python-pandas\/\">How to Create Pivot Tables in Python Pandas<\/a><br><a href=\"https:\/\/ubiq.co\/tech-blog\/how-to-connect-pandas-to-database\/\">How to Connect Pandas to Database<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>It is important to find duplicates in Python Pandas dataframe. Here are different ways to do this.<\/p>\n","protected":false},"author":1,"featured_media":8291,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[301],"tags":[424],"class_list":["post-8265","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python","tag-find-duplicates"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>How to Find Duplicates in Python Pandas Dataframe - Ubiq BI<\/title>\n<meta name=\"description\" content=\"It is important to find duplicates in Python Pandas dataframe. 
Here are different ways to identify duplicate rows or columns in dataframe.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How to Find Duplicates in Python Pandas Dataframe - Ubiq BI\" \/>\n<meta property=\"og:description\" content=\"It is important to find duplicates in Python Pandas dataframe. Here are different ways to identify duplicate rows or columns in dataframe.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/\" \/>\n<meta property=\"og:site_name\" content=\"Ubiq BI\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/ubiqbi\" \/>\n<meta property=\"article:published_time\" content=\"2025-05-19T06:06:02+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-05-20T07:10:26+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/ubiq.co\/tech-blog\/wp-content\/uploads\/2025\/05\/find-duplicates-in-pandas-dataframe.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"269\" \/>\n\t<meta property=\"og:image:height\" content=\"202\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Sreeram Sreenivasan\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@UbiqBI\" \/>\n<meta name=\"twitter:site\" content=\"@UbiqBI\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sreeram Sreenivasan\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/\"},\"author\":{\"name\":\"Sreeram Sreenivasan\",\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/#\\\/schema\\\/person\\\/db98d49a766a3a111d8510935ab90abc\"},\"headline\":\"How to Find Duplicates in Python Pandas Dataframe\",\"datePublished\":\"2025-05-19T06:06:02+00:00\",\"dateModified\":\"2025-05-20T07:10:26+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/\"},\"wordCount\":824,\"commentCount\":0,\"image\":{\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/ubiq.co\\\/tech-blog\\\/wp-content\\\/uploads\\\/2025\\\/05\\\/find-duplicates-in-pandas-dataframe.jpg?fit=269%2C202&ssl=1\",\"keywords\":[\"find duplicates\"],\"articleSection\":[\"Python\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/\",\"url\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/\",\"name\":\"How to Find Duplicates in Python Pandas Dataframe - Ubiq 
BI\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/ubiq.co\\\/tech-blog\\\/wp-content\\\/uploads\\\/2025\\\/05\\\/find-duplicates-in-pandas-dataframe.jpg?fit=269%2C202&ssl=1\",\"datePublished\":\"2025-05-19T06:06:02+00:00\",\"dateModified\":\"2025-05-20T07:10:26+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/#\\\/schema\\\/person\\\/db98d49a766a3a111d8510935ab90abc\"},\"description\":\"It is important to find duplicates in Python Pandas dataframe. Here are different ways to identify duplicate rows or columns in dataframe.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/#primaryimage\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/ubiq.co\\\/tech-blog\\\/wp-content\\\/uploads\\\/2025\\\/05\\\/find-duplicates-in-pandas-dataframe.jpg?fit=269%2C202&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/ubiq.co\\\/tech-blog\\\/wp-content\\\/uploads\\\/2025\\\/05\\\/find-duplicates-in-pandas-dataframe.jpg?fit=269%2C202&ssl=1\",\"width\":269,\"height\":202,\"caption\":\"find duplicates in pandas 
dataframe\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/how-to-find-duplicates-in-python-pandas-dataframe\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How to Find Duplicates in Python Pandas Dataframe\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/#website\",\"url\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/\",\"name\":\"Ubiq BI\",\"description\":\"Build dashboards &amp; reports in minutes\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/#\\\/schema\\\/person\\\/db98d49a766a3a111d8510935ab90abc\",\"name\":\"Sreeram Sreenivasan\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/4b3127ed2d4bb8efb3fa0bbb52cf2efd4d0156c97fc05a503537c883e8279947?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/4b3127ed2d4bb8efb3fa0bbb52cf2efd4d0156c97fc05a503537c883e8279947?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/4b3127ed2d4bb8efb3fa0bbb52cf2efd4d0156c97fc05a503537c883e8279947?s=96&d=mm&r=g\",\"caption\":\"Sreeram Sreenivasan\"},\"description\":\"Sreeram Sreenivasan is the Founder of Ubiq. He has helped many Fortune 500 companies in the areas of BI &amp; software development.\",\"sameAs\":[\"https:\\\/\\\/www.linkedin.com\\\/in\\\/sreeram-sreenivasan\\\/\"],\"url\":\"https:\\\/\\\/ubiq.co\\\/tech-blog\\\/author\\\/wordpress\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"How to Find Duplicates in Python Pandas Dataframe - Ubiq BI","description":"It is important to find duplicates in Python Pandas dataframe. Here are different ways to identify duplicate rows or columns in dataframe.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/","og_locale":"en_US","og_type":"article","og_title":"How to Find Duplicates in Python Pandas Dataframe - Ubiq BI","og_description":"It is important to find duplicates in Python Pandas dataframe. Here are different ways to identify duplicate rows or columns in dataframe.","og_url":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/","og_site_name":"Ubiq BI","article_publisher":"https:\/\/www.facebook.com\/ubiqbi","article_published_time":"2025-05-19T06:06:02+00:00","article_modified_time":"2025-05-20T07:10:26+00:00","og_image":[{"width":269,"height":202,"url":"https:\/\/ubiq.co\/tech-blog\/wp-content\/uploads\/2025\/05\/find-duplicates-in-pandas-dataframe.jpg","type":"image\/jpeg"}],"author":"Sreeram Sreenivasan","twitter_card":"summary_large_image","twitter_creator":"@UbiqBI","twitter_site":"@UbiqBI","twitter_misc":{"Written by":"Sreeram Sreenivasan","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#article","isPartOf":{"@id":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/"},"author":{"name":"Sreeram Sreenivasan","@id":"https:\/\/ubiq.co\/tech-blog\/#\/schema\/person\/db98d49a766a3a111d8510935ab90abc"},"headline":"How to Find Duplicates in Python Pandas Dataframe","datePublished":"2025-05-19T06:06:02+00:00","dateModified":"2025-05-20T07:10:26+00:00","mainEntityOfPage":{"@id":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/"},"wordCount":824,"commentCount":0,"image":{"@id":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/ubiq.co\/tech-blog\/wp-content\/uploads\/2025\/05\/find-duplicates-in-pandas-dataframe.jpg?fit=269%2C202&ssl=1","keywords":["find duplicates"],"articleSection":["Python"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/","url":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/","name":"How to Find Duplicates in Python Pandas Dataframe - Ubiq 
BI","isPartOf":{"@id":"https:\/\/ubiq.co\/tech-blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#primaryimage"},"image":{"@id":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/ubiq.co\/tech-blog\/wp-content\/uploads\/2025\/05\/find-duplicates-in-pandas-dataframe.jpg?fit=269%2C202&ssl=1","datePublished":"2025-05-19T06:06:02+00:00","dateModified":"2025-05-20T07:10:26+00:00","author":{"@id":"https:\/\/ubiq.co\/tech-blog\/#\/schema\/person\/db98d49a766a3a111d8510935ab90abc"},"description":"It is important to find duplicates in Python Pandas dataframe. Here are different ways to identify duplicate rows or columns in dataframe.","breadcrumb":{"@id":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#primaryimage","url":"https:\/\/i0.wp.com\/ubiq.co\/tech-blog\/wp-content\/uploads\/2025\/05\/find-duplicates-in-pandas-dataframe.jpg?fit=269%2C202&ssl=1","contentUrl":"https:\/\/i0.wp.com\/ubiq.co\/tech-blog\/wp-content\/uploads\/2025\/05\/find-duplicates-in-pandas-dataframe.jpg?fit=269%2C202&ssl=1","width":269,"height":202,"caption":"find duplicates in pandas dataframe"},{"@type":"BreadcrumbList","@id":"https:\/\/ubiq.co\/tech-blog\/how-to-find-duplicates-in-python-pandas-dataframe\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/ubiq.co\/tech-blog\/"},{"@type":"ListItem","position":2,"name":"How to Find Duplicates in Python Pandas 
Dataframe"}]},{"@type":"WebSite","@id":"https:\/\/ubiq.co\/tech-blog\/#website","url":"https:\/\/ubiq.co\/tech-blog\/","name":"Ubiq BI","description":"Build dashboards &amp; reports in minutes","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ubiq.co\/tech-blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/ubiq.co\/tech-blog\/#\/schema\/person\/db98d49a766a3a111d8510935ab90abc","name":"Sreeram Sreenivasan","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/4b3127ed2d4bb8efb3fa0bbb52cf2efd4d0156c97fc05a503537c883e8279947?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/4b3127ed2d4bb8efb3fa0bbb52cf2efd4d0156c97fc05a503537c883e8279947?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/4b3127ed2d4bb8efb3fa0bbb52cf2efd4d0156c97fc05a503537c883e8279947?s=96&d=mm&r=g","caption":"Sreeram Sreenivasan"},"description":"Sreeram Sreenivasan is the Founder of Ubiq. 
He has helped many Fortune 500 companies in the areas of BI &amp; software development.","sameAs":["https:\/\/www.linkedin.com\/in\/sreeram-sreenivasan\/"],"url":"https:\/\/ubiq.co\/tech-blog\/author\/wordpress\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/ubiq.co\/tech-blog\/wp-content\/uploads\/2025\/05\/find-duplicates-in-pandas-dataframe.jpg?fit=269%2C202&ssl=1","jetpack_shortlink":"https:\/\/wp.me\/pbGGTT-29j","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/ubiq.co\/tech-blog\/wp-json\/wp\/v2\/posts\/8265","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ubiq.co\/tech-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ubiq.co\/tech-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ubiq.co\/tech-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ubiq.co\/tech-blog\/wp-json\/wp\/v2\/comments?post=8265"}],"version-history":[{"count":22,"href":"https:\/\/ubiq.co\/tech-blog\/wp-json\/wp\/v2\/posts\/8265\/revisions"}],"predecessor-version":[{"id":8325,"href":"https:\/\/ubiq.co\/tech-blog\/wp-json\/wp\/v2\/posts\/8265\/revisions\/8325"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ubiq.co\/tech-blog\/wp-json\/wp\/v2\/media\/8291"}],"wp:attachment":[{"href":"https:\/\/ubiq.co\/tech-blog\/wp-json\/wp\/v2\/media?parent=8265"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ubiq.co\/tech-blog\/wp-json\/wp\/v2\/categories?post=8265"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ubiq.co\/tech-blog\/wp-json\/wp\/v2\/tags?post=8265"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}