python handle Excel

Related instructions :
1,Python Bring your own csv Module can handle .csv file .

2,xlrd and xlwt The two modules are used to read Excel And write Excel, Support only .xls and .xlsx format
,xlutils The module can read and write an existing one at the same time Excel file , Depend on xlrd and xlwt.

3,openpyxl( read-write excel surface )
Special treatment Excel2007 And above xlsx file ;2007 The following version is xls End of file , Need to use xlrd( read ) and xlwt( write ) Library

4,excel If the text code of the table is “gb2312” After reading, garbled code will be displayed , Please transfer to Unicode

5,workbook: workbook , One excel The file contains multiple sheet.

6,sheet: Worksheet , One workbook There are many , Table name recognition , as “sheet1”,“sheet2” etc .

7,cell: Cell , Store data objects

8, Data types in common cells empty( Empty ),string(text),number, date, boolean, error,blank( blank )

use xlrd read Excel

xlrd There are many interfaces provided , The common methods are as follows :

method         explain
open_workbook()

  Open the specified Excel file , Return a Book object : adopt Book Object can get the Sheet object ( One Excel There can be multiple files Sheet, each Sheet It's a form )

Book.nsheets     return Sheet Number of
Book.sheets() Return to all Sheet Object list
Book.sheet_by_index(index)   Returns the value at the specified index Sheet. amount to Book.sheets()[index]
Book.sheet_names()  Return to all Sheet Object name list( For a str List of types )
Book.sheet_by_name(name) As specified Sheet Object name return She
adopt Sheet Object to get individual cells , Each cell is a Cell object

Method statement
Sheet.name   Returns the name of the table
Sheet.nrows  Returns the number of rows in the table
Sheet.ncols     Returns the number of columns in the table
Sheet.row(r)  Gets the specified row , return Cell Object list
Sheet.col(c)  Gets the specified column , return Cell Object list
Sheet.row_values(r)   Gets the value of the specified row , return list
Sheet.col_values(c)  Gets the value of the specified column , return list
Sheet.cell(r, c)   Get by location Cell object
Sheet.cell_value(r, c)  Get by location Cell The value of the object
Cell.value Returns the value of the cell
notes :
On the whole, we are dealing with it Excel The steps are :
Get the data to be processed first Excel File object (Book object ,open_workbook()), Then get the sheet object (Book.sheet_by_index(index) And so on ), stay sheet Object

 

This test Excel file

 

Get worksheet object

We know that a workbook can contain multiple worksheets , When we get “ Workbook objects ” after , You can then get the worksheet object , Can pass “ Indexes ” The way to get , You can also pass the “ Table name ” The way to get .

example 1:
adopt Book.sheets( ) Method to obtain sheet object
import xlrd excel_path = "F:\\Excel_demo\\demo.xls"
# Open file , obtain excel Of documents workbook( workbook ) object excel =
xlrd.open_workbook(excel_path,encoding_override="utf-8") # Return to all Sheet Object list
all_sheet = excel.sheets()#Book( workbook ) Object method print(all_sheet) # Traversing returned Sheet Object list
for each_sheet in all_sheet:     print(each_sheet)    
print("sheet The name is :",each_sheet.name)#sheet Object method """ [<xlrd.sheet.Sheet object at
0x0000021983B380F0>, <xlrd.sheet.Sheet object at 0x0000021983B38208>,
<xlrd.sheet.Sheet object at 0x0000021983B382B0>] <xlrd.sheet.Sheet object at
0x0000021983B380F0> sheet The name is : Sheet1 <xlrd.sheet.Sheet object at
0x0000021983B38208> sheet The name is : Sheet2 <xlrd.sheet.Sheet object at
0x0000021983B382B0> sheet The name is : Sheet3 """
example 1_1:
adopt Book.sheet_by_index(index) Method to obtain sheet object
import xlrd excel_path = "F:\\Excel_demo\\demo.xls"
# Open file , obtain excel Of documents workbook( workbook ) object excel =
xlrd.open_workbook(excel_path,encoding_override="utf-8") # Return to all Sheet Object list
all_sheet = excel.sheets() print(all_sheet) for i in range(len(all_sheet)):    
each_sheet_by_index = excel.sheet_by_index(i)# adopt sheet Index acquisition sheet object    
print(" The table name is :{0}, Type is :{1}".format(each_sheet_by_index,
type(each_sheet_by_index)))     print("sheet The name is :", each_sheet_by_index.name)  #
sheet Object method """ [<xlrd.sheet.Sheet object at 0x0000023D55D89128>,
<xlrd.sheet.Sheet object at 0x0000023D55D89240>, <xlrd.sheet.Sheet object at
0x0000023D55D892E8>] The table name is :<xlrd.sheet.Sheet object at
0x0000023D55D89128>, Type is :<class 'xlrd.sheet.Sheet'> sheet The name is : Sheet1
The table name is :<xlrd.sheet.Sheet object at 0x0000023D55D89240>, Type is :<class
'xlrd.sheet.Sheet'> sheet The name is : Sheet2 The table name is :<xlrd.sheet.Sheet object at
0x0000023D55D892E8>, Type is :<class 'xlrd.sheet.Sheet'> sheet The name is : Sheet3 """
notes :
You can see from the code above , This usage (excel.sheet_by_index(i)) In fact, it is Book.sheets()[index] identical , Just an example 1 The usage in is simple

example 1_2:
Book.sheet_names() : This is not the only way to get it sheet object , It's just one str List of types (sheet List of object names )
import xlrd excel_path = "F:\\Excel_demo\\demo.xls"
# Open file , obtain excel Of documents workbook( workbook ) object excel =
xlrd.open_workbook(excel_path,encoding_override="utf-8") # Return to all Sheet Object name list
all_sheet = excel.sheet_names() print(all_sheet) # Traversing returned Sheet Object name list for
each_sheet_by_name in all_sheet:    
print(" The table name is :{0}, Type is :{1}".format(each_sheet_by_name ,type(each_sheet_by_name )))
    #print(each_sheet_by_name.ncols)      """ ['Sheet1', 'Sheet2', 'Sheet3']
The table name is :Sheet1, Type is :<class 'str'> The table name is :Sheet2, Type is :<class 'str'>
The table name is :Sheet3, Type is :<class 'str'> """
notes :
As can be seen from the above example, the traversed object is only a string type object ('Sheet1', 'Sheet2',
'Sheet3'), instead of sheet object , Therefore, it cannot be used seet Object method (Sheet.name etc. )

example 1_2:
Obtained by the table name index :Book.sheet_by_name(name)
import xlrd excel_path = "F:\\Excel_demo\\demo.xls"
# Open file , obtain excel Of documents workbook( workbook ) object excel =
xlrd.open_workbook(excel_path,encoding_override="utf-8") # Return to all Sheet Object name list
all_sheet = excel.sheet_names() #workbook.sheet_names()
Return a list object , You can do this list Object print(all_sheet) for i in all_sheet:    
each_sheet_by_name = excel.sheet_by_name(i)# adopt sheet Index acquisition sheet object    
print(" The table name is :{0}, Type is :{1}".format(each_sheet_by_name, type(each_sheet_by_name)))
    print("sheet The name is :", each_sheet_by_name.name)  # sheet Object method """ ['Sheet1',
'Sheet2', 'Sheet3'] The table name is :<xlrd.sheet.Sheet object at
0x00000202BF748080>, Type is :<class 'xlrd.sheet.Sheet'> sheet The name is : Sheet1
The table name is :<xlrd.sheet.Sheet object at 0x00000202BF748198>, Type is :<class
'xlrd.sheet.Sheet'> sheet The name is : Sheet2 The table name is :<xlrd.sheet.Sheet object at
0x00000202BF748240>, Type is :<class 'xlrd.sheet.Sheet'> sheet The name is : Sheet3 """
example 1_4:
In addition to the above methods , There's another way to get it sheet Method of object
import xlrd excel_path = "F:\\Excel_demo\\demo.xls"
# Open file , obtain excel Of documents workbook( workbook ) object excel =
xlrd.open_workbook(excel_path,encoding_override="utf-8") # return Sheet Number of
all_sheet_count = excel.nsheets print(all_sheet_count) for i in
range(all_sheet_count):     sheet =
excel.sheet_by_index(i)# By index sheet Number , Similar to the second method    
print(" The table name is :{0}, Type is :{1}".format(sheet , type(sheet ))) """ 3
The table name is :<xlrd.sheet.Sheet object at 0x000001EA57F39048>, Type is :<class
'xlrd.sheet.Sheet'> The table name is :<xlrd.sheet.Sheet object at
0x000001EA57F39160>, Type is :<class 'xlrd.sheet.Sheet'> The table name is :<xlrd.sheet.Sheet
object at 0x000001EA57F39208>, Type is :<class 'xlrd.sheet.Sheet'> """
 

Get the basic information of the worksheet

In obtaining “ Table object ” after , We can get basic information about the worksheet . Include table name , The number of rows and columns, etc
example 2: obtain sheet The name of the object , Number of rows , Number of columns
import xlrd excel_path = "F:\\Excel_demo\\demo.xls"
# Open file , obtain excel Of documents workbook( workbook ) object excel =
xlrd.open_workbook(excel_path,encoding_override="utf-8") # obtain sheet object all_sheet =
excel.sheets() # Loop through each sheet object sheet_name = [] sheet_row = [] sheet_col = [] for
sheet in all_sheet:     sheet_name.append(sheet.name)    
print(" The Excel share {0} individual sheet, current sheet The name is {1}, The sheet share {2} That's ok ,{3} column "          
.format(len(all_sheet),sheet.name,sheet.nrows,sheet.ncols))    
sheet_row.append(sheet.nrows)     sheet_col.append(sheet.ncols)
print(sheet_name)# obtain sheet Name of print(sheet_row )# obtain sheet The number of rows of the object
print(sheet_col)# obtain sheet The number of columns for the object """
The Excel share 3 individual sheet, current sheet The name is Sheet1, The sheet share 13 That's ok ,9 column
The Excel share 3 individual sheet, current sheet The name is Sheet2, The sheet share 9 That's ok ,10 column
The Excel share 3 individual sheet, current sheet The name is Sheet3, The sheet share 18 That's ok ,8 column ['Sheet1', 'Sheet2', 'Sheet3']
[13, 9, 18] [9, 10, 8] """

example 3: Get data from a worksheet by row or column
import xlrd excel_path = "F:\\Excel_demo\\demo.xls"
# Open file , obtain excel Of documents workbook( workbook ) object excel =
xlrd.open_workbook(excel_path,encoding_override="utf-8") # obtain sheet object all_sheet =
excel.sheets() # Loop through each sheet object sheet_name = [] sheet_row = [] sheet_col = [] for
sheet in all_sheet:     sheet_name.append(sheet.name)    
print(" The Excel share {0} individual sheet, current sheet The name is {1}, The sheet share {2} That's ok ,{3} column "          
.format(len(all_sheet),sheet.name,sheet.nrows,sheet.ncols))     for each_row in
range(sheet.nrows):# Circulates each line         print(" Currently %s That's ok :"% each_row,type(each_row))  
      print(sheet.row_values(each_row),type(sheet.row_values(each_row)))
first_row_value = sheet.row_values(0)# Prints a specified line print(" The data in the first line is :%s" %
first_row_value) """ The Excel share 3 individual sheet, current sheet The title is less than one to eight , The sheet share 4 That's ok ,6 column Currently 0 That's ok :
<class 'int'> [' region ', ' Unit name ', ' Position code ', ' Title of position ', ' Number of recruits ', ' Number of applicants '] <class 'list'>
Currently 1 That's ok : <class 'int'> [' Provincial units ', ' Zhejiang Police Vocational Academy ', '13312004000000001', ' Teaching of judicial police major ', 1.0,
'0'] <class 'list'> Currently 2 That's ok : <class 'int'> [' Provincial units ', ' Zhejiang Police Vocational Academy ',
'13312004000000002', ' Law Teaching ', 1.0, '1'] <class 'list'> Currently 3 That's ok : <class 'int'>
....... The data in the first line is :[' region ', ' Unit name ', ' Position code ', ' Title of position ', ' Number of recruits ', ' Number of applicants '] """
notes :
1, As can be seen from the example above , In obtaining sheet After the number of rows of the object , You can use the loop traversal method (for each_row in
range(sheet.nrows), Pass again sheet.row_values(each_row) Method to get the data of each row

2, The index method can be used to obtain the data of a specified row (sheet.row_values(0)): Actually, it's the same as above 1 In fact, they are consistent

3, It can be seen from the output results : The data type of each row obtained is a list , The data of each cell is an element in the list , All the data in a row make up a list
 

example 3_1:
import xlrd excel_path = "F:\\Excel_demo\\demo.xls"
# Open file , obtain excel Of documents workbook( workbook ) object excel =
xlrd.open_workbook(excel_path,encoding_override="utf-8") # obtain sheet object all_sheet =
excel.sheets() # Loop through each sheet object sheet_name = [] sheet_row = [] sheet_col = [] for
sheet in all_sheet:     sheet_name.append(sheet.name)    
print(" The Excel share {0} individual sheet, current sheet The name is {1}, The sheet share {2} That's ok ,{3} column "          
.format(len(all_sheet),sheet.name,sheet.nrows,sheet.ncols))    
print(sheet.col_values(0))# Gets the data of the specified column     for each_col in
range(sheet.ncols):# Get the data of each column in turn         print(" Currently %s column :"% each_col )        
print(sheet.col_values(each_col ),type(sheet.col_values(each_col ))) """
The Excel share 3 individual sheet, current sheet The title is less than one to eight , The sheet share 4 That's ok ,6 column [' region ', ' Provincial units ', ' Provincial units ', ' Provincial units ']
Currently 0 column : [' region ', ' Provincial units ', ' Provincial units ', ' Provincial units '] <class 'list'> Currently 1 column : [' Unit name ',
' Zhejiang Police Vocational Academy ', ' Zhejiang Police Vocational Academy ', ' Zhejiang Police Vocational Academy '] <class 'list'> Currently 2 column : [' Position code ',
'13312004000000001', '13312004000000002', '13312004000000003'] <class 'list'>
Currently 3 column : [' Title of position ', ' Teaching of judicial police major ', ' Law Teaching ', ' Civil aviation air security teaching '] <class 'list'> Currently 4 column : ..... """
notes :
1, As can be seen from the example above : You can traverse all the columns (for each_col in
range(sheet.ncols)) To get the data of all columns , You can also get the data of the specified column through the index

2, The data returned is still a list : The data of each cell in a column is an element in the list

 

Gets the row or column object

adopt sheet Object Sheet.row(r) or Sheet.col(c) Gets the specified row or column , return Cell Object list
example 4:
import xlrd excel_path = "F:\\Excel_demo\\demo.xls"
# Open file , obtain excel Of documents workbook( workbook ) object excel =
xlrd.open_workbook(excel_path,encoding_override="utf-8") # obtain sheet object all_sheet =
excel.sheets() # Loop through each sheet object for sheet in all_sheet:    
print(" The Excel share {0} individual sheet, current sheet The name is {1}, The sheet share {2} That's ok ,{3} column "          
.format(len(all_sheet),sheet.name,sheet.nrows,sheet.ncols))    
sheet_cell_first_row = sheet.row(0)# Gets the specified row object     print(sheet_cell_first_row)    
for i in range(sheet.nrows):# Each row of objects is traversed in turn         each_cell_value_row =
sheet.row(i)         print(each_cell_value_row,type(each_cell_value_row)) """
The Excel share 3 individual sheet, current sheet The title is less than one to eight , The sheet share 4 That's ok ,6 column [text:' region ', text:' Unit name ',
text:' Position code ', text:' Title of position ', text:' Number of recruits ', text:' Number of applicants '] [text:' region ', text:' Unit name ',
text:' Position code ', text:' Title of position ', text:' Number of recruits ', text:' Number of applicants '] <class 'list'>
[text:' Provincial units ', text:' Zhejiang Police Vocational Academy ', text:'13312004000000001', text:' Teaching of judicial police major ',
number:1.0, text:'0'] <class 'list'> [text:' Provincial units ', text:' Zhejiang Police Vocational Academy ',
text:'13312004000000002', text:' Law Teaching ', number:1.0, text:'1'] <class 'list'>
[text:' Provincial units ', text:' Zhejiang Police Vocational Academy ', text:'13312004000000003', text:' Civil aviation air security teaching ',
number:1.0, text:'1'] <class 'list'>
The Excel share 3 individual sheet, current sheet Positions with names greater than one to eighty , The sheet share 11 That's ok ,6 column """
notes :
As can be seen from the output above : The type of the returned result is a list of nested dictionaries , The key of the dictionary is the default value (test), The dictionary value is the value of each cell ( data )

example 4_1:
import xlrd excel_path = "F:\\Excel_demo\\demo.xls"
# Open file , obtain excel Of documents workbook( workbook ) object excel =
xlrd.open_workbook(excel_path,encoding_override="utf-8") # obtain sheet object all_sheet =
excel.sheets() # Loop through each sheet object for sheet in all_sheet:    
print(" The Excel share {0} individual sheet, current sheet The name is {1}, The sheet share {2} That's ok ,{3} column "          
.format(len(all_sheet),sheet.name,sheet.nrows,sheet.ncols))    
sheet_cell_first_col = sheet.col(0)# Gets the specified column object     print(sheet_cell_first_col)    
for i in range(sheet.ncols):# Each column of objects is traversed in turn         each_cell_value_col =
sheet.col(i)         print(each_cell_value_col,type(each_cell_value_col)) """
The Excel share 3 individual sheet, current sheet The title is less than one to eight , The sheet share 4 That's ok ,6 column [text:' region ', text:' Provincial units ',
text:' Provincial units ', text:' Provincial units '] [text:' region ', text:' Provincial units ', text:' Provincial units ', text:' Provincial units ']
<class 'list'> [text:' Unit name ', text:' Zhejiang Police Vocational Academy ', text:' Zhejiang Police Vocational Academy ', text:' Zhejiang Police Vocational Academy ']
<class 'list'> [text:' Position code ', text:'13312004000000001',
text:'13312004000000002', text:'13312004000000003'] <class 'list'>
[text:' Title of position ', text:' Teaching of judicial police major ', text:' Law Teaching ', text:' Civil aviation air security teaching '] <class 'list'>
[text:' Number of recruits ', number:1.0, number:1.0, number:1.0] <class 'list'> [text:' Number of applicants ',
text:'0', text:'1', text:'1'] <class 'list'>
The Excel share 3 individual sheet, current sheet Positions with names greater than one to eighty , The sheet share 11 That's ok ,6 column """
 

Get the data of a cell

1, We can also pinpoint the query to a cell . stay xlrd In the module , The rows and columns of the worksheet are derived from the 0 Start counting

2, Cell : A cell is the intersection of rows and columns in a table , It is the smallest unit that makes up a table , Can be split or merged . The input and modification of single data are carried out in cells
example 5:
import xlrd excel_path = "F:\\Excel_demo\\demo.xls"
# Open file , obtain excel Of documents workbook( workbook ) object excel =
xlrd.open_workbook(excel_path,encoding_override="utf-8") # obtain sheet object all_sheet =
excel.sheets() # Loop through each sheet object for sheet in all_sheet:
print(" The Excel share {0} individual sheet, current sheet The name is {1}, The sheet share {2} That's ok ,{3} column "
.format(len(all_sheet),sheet.name,sheet.nrows,sheet.ncols)) sheet_cell =
sheet.cell(0,0)# Get by location Cell object print(sheet_cell) sheet_cell_value =
sheet_cell.value# Returns the value of the cell print(sheet_cell_value) sheet_cell_value_1 =
sheet.cell_value(0,0)# Get by location Cell The value of the object print(sheet_cell_value_1) """
The Excel share 3 individual sheet, current sheet The title is less than one to eight , The sheet share 4 That's ok ,6 column text:' region ' region region
The Excel share 3 individual sheet, current sheet Positions with names greater than one to eighty , The sheet share 11 That's ok ,6 column text:' region ' region region
The Excel share 3 individual sheet, current sheet The name is Sheet1, The sheet share 22 That's ok ,6 column text:' region ' region region """
 

expand :

Gets the value of each cell ( data )
example 6:
import xlrd excel_path = "F:\\Excel_demo\\demo.xls"
# Open file , obtain excel Of documents workbook( workbook ) object excel =
xlrd.open_workbook(excel_path,encoding_override="utf-8") # obtain sheet object sheet =
excel.sheets()[0] sheet_row_mount = sheet.nrows#4 That's ok sheet_col_mount =
sheet.ncols#6 column print(sheet_row_mount,sheet_col_mount) for x in
range(sheet_row_mount):#4     y = 0     while y < sheet_col_mount:#6        
print(sheet.cell_value(x,y))         y += 1 """ 4 6 region Unit name Position code Title of position Number of recruits Number of applicants
Provincial units Zhejiang Police Vocational Academy 13312004000000001 Teaching of judicial police major 1.0 0 Provincial units ...... """
notes :
Excel The index of the rows and columns in the file is from the 0 The beginning ( And the file returns a list ), So if the judgment condition in the above example is while y <=
sheet_col_mount It will report an error :IndexError: list index out of range

 

Technology