### Study area

The study area is shown in Fig. 1.

Although Anhui Province is an important part of the Yangtze River Delta city cluster, its economic development lags behind and its urbanization has not reached the national average level, especially compared with the rest of the Yangtze River Delta city cluster. Anhui Province has therefore begun to emphasize applying the new development concept throughout the whole process and in all fields of new urbanization, led by intensive, compact urban development and low-carbon transformation. Furthermore, as new urbanization continues to advance in Anhui Province, the interaction between new urbanization and carbon emission has attracted extensive public attention. It is therefore of practical significance to take Anhui Province as the research object and analyze the coupling relationship between new urbanization and carbon emission, so as to provide a reference for the construction of low-carbon urbanization in Anhui Province and other provinces in China.

### Data sources and indicators

#### Data sources

The selected indicator data were obtained from the National Bureau of Statistics of China and the Statistical Yearbook of Anhui Province (2012–2021), and missing data were filled in by interpolation. Interpolation is a widely used mathematical method for estimating unknown values from known ones^{27,28,29}. To keep the filled values consistent with the observed series, this study fitted polynomial trend lines in Excel to estimate the unknown data, which gave a better fit to the data trend^{30}.
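The trend-line idea can be illustrated outside Excel. The following is a minimal sketch, assuming a least-squares polynomial fit with NumPy stands in for the Excel trend line; the function name, the polynomial degree and the sample series are hypothetical and are not taken from the study's data.

```python
import numpy as np

def fill_missing_with_trend(years, values, degree=2):
    """Fill None entries with values from a least-squares polynomial trend.

    Assumption: a polynomial of the given degree is an adequate trend model.
    """
    years = np.asarray(years, dtype=float)
    known = np.array([v is not None for v in values])
    y_known = np.array([v for v in values if v is not None], dtype=float)
    # Centre the years so the fit is numerically well conditioned.
    t = years - years.mean()
    coeffs = np.polyfit(t[known], y_known, degree)
    fitted = np.polyval(coeffs, t)
    # Keep observed values; use the fitted trend only where data are missing.
    return [float(v) if v is not None else float(f)
            for v, f in zip(values, fitted)]

# Hypothetical series following (year - 2012)^2 + 1, with 2015 missing.
years = list(range(2012, 2018))
values = [1.0, 2.0, 5.0, None, 17.0, 26.0]
filled = fill_missing_with_trend(years, values)
```

Only the missing entry is replaced; observed values pass through unchanged, which mirrors reading a single value off a fitted trend line.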

#### Indicators

Based on the basic principles of scientificity, rationality and usability, a new urbanization evaluation indicator system covering population urbanization, economic urbanization, social urbanization, ecological urbanization and innovation urbanization was constructed with reference to the connotation of new urbanization in the National New Urbanization Plan (2021–2035) and existing academic research^{31,32,33,34,35,36,37,38}. A carbon emission evaluation indicator system covering population carbon emission, economic carbon emission and energy carbon emission was then constructed^{39,40,41}. The specific indicators are described as follows:

(i) Population urbanization, which reflects the population factor dimension of new urbanization. The specific indicators under this dimension include the urbanization rate of the resident population, urban population density and the registered urban unemployment rate^{42}. The urbanization rate and population density directly reflect the trend of urban agglomeration. The registered urban unemployment rate reflects whether urbanization has brought employment opportunities and livelihood security to residents.

(ii) Economic urbanization, which reflects the economic factor dimension of new urbanization. The specific indicators under this dimension include GDP per capita, the proportion of output value of secondary industry to GDP, the proportion of output value of tertiary industry to GDP, and the growth rate of fixed asset investment^{43,44}. GDP per capita measures economic development. The proportion of secondary and tertiary industries represents the industrial structure, which is an important part of the socioeconomic system. The growth rate of fixed asset investment represents the speed dynamics of infrastructure project construction in the urbanization process.

(iii) Social urbanization, which reflects the social factor dimension of new urbanization. Specific indicators under this dimension include the number of health technicians per 10,000 people, the number of college students per 100,000 people, the number of privately owned cars, the ratio of per capita disposable income of urban to rural residents, and the number of urban and rural residents receiving subsistence allowances^{8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46}. These indicators represent the development of social security and education in the process of new urbanization.

(iv) Ecological urbanization, which reflects the ecological factor dimension of new urbanization. Specific indicators under this dimension include the greening coverage rate of urban built-up areas, per capita park green space area, household waste disposal rate, and per capita water resources. These indicators represent the level of municipal health and the ecological environment in the process of new urbanization.

(v) Innovation urbanization, which reflects the innovation factor dimension of new urbanization. Specific indicators under this dimension include the number of invention patents granted and the number of R&D personnel^{49,50}. Soft power such as innovation and invention is indispensable to the promotion of new urbanization.

(vi) Population carbon emission, which is the carbon emission embodied in the new urbanization process in the dimension of population factors. The specific indicators under this dimension include electricity consumption and new energy vehicle production^{51,52}. As urbanization develops, the living standard and consumption level of the population rise significantly, leading to a gradual increase in total electricity consumption and automobile consumption, which directly affects the level of carbon emission.

(vii) Economic carbon emission, which is the carbon emission embodied in the new urbanization process in the dimension of economic factors. The specific indicators under this dimension include the output value of strategic emerging industries and energy consumption per unit of GDP^{53,54,55}. Strategic emerging industries are the deep integration of emerging technologies and emerging industries. Energy consumption per unit of GDP reflects changes in economic structure and energy utilization efficiency. In line with related literature^{56,57}, both can be used as economic indicators of the carbon emission level.

(viii) Energy carbon emission, which is the carbon emission embodied in the new urbanization process in the dimension of energy factors. The specific indicators under this dimension include total energy production and total energy consumption^{58,59}. Both energy production and consumption are energy activities, which are important sources of greenhouse gas emission, and therefore serve as energy-based indicators of the carbon emission level.

### Indicator system construction

Based on the selected indicators, the evaluation indicator system for the matching degree between new urbanization and carbon emission in Anhui Province is presented in Table 1.

The comprehensive development evaluation model is used to evaluate the matching degree relationship between new urbanization and carbon emission in Anhui Province, and the specific steps are given as follows:

(1) Indicators with positive and negative tendencies are made dimensionless using formulas (1)–(2).

Positive trend:

$$y_{ij} = \frac{x_{ij} - \min x_{ij}}{{\max x_{ij} - \min x_{ij}}}$$

(1)

Negative trend:

$$y_{ij} = \frac{\max x_{ij} - x_{ij}}{{\max x_{ij} - \min x_{ij}}}$$

(2)

where \(x_{ij}\) is the value of indicator \(j\) in year \(i\), and \(\max x_{ij}\) and \(\min x_{ij}\) are the maximum and minimum of indicator \(j\) over all years. Formulas (1)–(2) normalize values with different trends to \(y_{ij}\), so that subsequent calculations are not affected by differences in units or direction.
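As an illustration of formulas (1)–(2), the following minimal sketch scales one indicator's yearly values to the range 0 to 1. The function name and the sample values are hypothetical; treating a constant indicator as an error is an assumption made here, since the text does not address that case.

```python
def normalize(column, positive=True):
    """Min-max scale one indicator's yearly values, per formulas (1)-(2)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant indicator cannot be scaled and is rejected.
        raise ValueError("indicator is constant; min-max scaling is undefined")
    if positive:
        return [(x - lo) / (hi - lo) for x in column]   # formula (1)
    return [(hi - x) / (hi - lo) for x in column]       # formula (2)

# Hypothetical values for illustration only.
gdp_per_capita = [2.0, 3.0, 6.0]       # positive-trend indicator
energy_intensity = [0.5, 0.75, 1.0]    # negative-trend indicator
y_pos = normalize(gdp_per_capita, positive=True)    # [0.0, 0.25, 1.0]
y_neg = normalize(energy_intensity, positive=False) # [1.0, 0.5, 0.0]
```

After scaling, a larger value is always "better" regardless of the indicator's original direction, which is what allows the indicators to be combined later.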

(2) The information entropy \(e_{j}\) and the weight \(\omega_{j}\) of each indicator can be obtained by formulas (3)–(5).

$$e_{j} = - \frac{1}{\ln m}\sum\limits_{i = 1}^{m} {f_{ij}\ln f_{ij}}$$

(3)

$$f_{ij} = \frac{y_{ij} + 1}{{\sum\limits_{i = 1}^{m} {\left( {y_{ij} + 1} \right)} }}$$

(4)

$$\omega_{j} = \frac{1 - e_{j}}{{n - \sum\limits_{j = 1}^{n} {e_{j}} }}$$

(5)

where \(y_{ij}\) is the standardized value of indicator \(j\) in year \(i\), obtained by formulas (1)–(2); \(m\) is the number of years, which is 10 in this study; and \(n\) is the number of indicators. The entropy \(e_{j}\) of each indicator is obtained by formulas (3)–(4). The weight \(\omega_{j}\) of each indicator is then obtained by formula (5) and corresponds to the importance of that indicator in the evaluation system.
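The entropy weighting in formulas (3)–(5) can be sketched as follows. This is a minimal illustration, assuming the normalized data are held as a list of yearly rows; the function name and the small sample matrix are hypothetical.

```python
import math

def entropy_weights(Y):
    """Y[i][j] is the normalized value of indicator j in year i.

    Returns one weight per indicator, following formulas (3)-(5).
    """
    m, n = len(Y), len(Y[0])
    entropies = []
    for j in range(n):
        # Formula (4): shift by 1 so every share is positive and ln is defined.
        shifted = [Y[i][j] + 1.0 for i in range(m)]
        total = sum(shifted)
        shares = [s / total for s in shifted]
        # Formula (3): information entropy, scaled by ln(m).
        e_j = -sum(p * math.log(p) for p in shares) / math.log(m)
        entropies.append(e_j)
    # Formula (5): lower entropy (more variation) gives a larger weight.
    denom = n - sum(entropies)
    return [(1.0 - e) / denom for e in entropies]

# Hypothetical normalized data: 3 years (rows) x 2 indicators (columns).
Y = [[0.0, 0.0],
     [0.5, 0.0],
     [1.0, 1.0]]
weights = entropy_weights(Y)
```

In this sample the second indicator is more unevenly distributed across years, so it receives the larger weight; the weights always sum to 1 by construction. If every indicator were constant, the denominator in formula (5) would be zero, a case this sketch does not handle.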

(3) The comprehensive development level of new urbanization and carbon emission can be obtained by formulas (6)–(8).

$$F\left( {x_{1} } \right) = \sum\limits_{{j = 1}}^{{n_{1} }} {a_{j1} \times X_{j1} }$$

(6)

$$F\left( {x_{2} } \right) = \sum\limits_{{j = 1}}^{{n_{2} }} {a_{j2} \times X_{j2} }$$

(7)

$$T = \alpha F\left( {x_{1} } \right) + \beta F\left( {x_{2} } \right)$$

(8)

where \(F\left( {x_{1} } \right)\) and \(F\left( {x_{2} } \right)\) are the development indices of new urbanization and carbon emission, respectively; \(a_{j1}\) and \(a_{j2}\) are the weights of indicator \(j\) of new urbanization and carbon emission, respectively; \(X_{j1}\) and \(X_{j2}\) are the standardized values of indicator \(j\) of new urbanization and carbon emission, respectively; and \(n_{1}\) and \(n_{2}\) are the numbers of indicators of new urbanization and carbon emission, respectively. Formula (6) gives the new urbanization development coefficient, formula (7) gives the carbon emission development coefficient, and formula (8) gives the comprehensive development coefficient \(T\) of the two systems. Following previous studies, new urbanization and carbon emission are treated as equally important^{60,61}, so \(\alpha = \beta = 0.5\).
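Formulas (6)–(8) are weighted sums, which the following minimal sketch makes concrete for a single year. The function names, weights and standardized values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def composite_score(weights, values):
    """Weighted sum of standardized indicator values, as in formulas (6)-(7)."""
    return sum(a * x for a, x in zip(weights, values))

def comprehensive_level(f1, f2, alpha=0.5, beta=0.5):
    """Formula (8): T = alpha * F(x1) + beta * F(x2)."""
    return alpha * f1 + beta * f2

# Hypothetical weights and standardized values for one year.
urban_weights, urban_values = [0.5, 0.25, 0.25], [1.0, 0.5, 0.0]
carbon_weights, carbon_values = [0.5, 0.5], [0.25, 0.5]

f1 = composite_score(urban_weights, urban_values)    # 0.625
f2 = composite_score(carbon_weights, carbon_values)  # 0.375
T = comprehensive_level(f1, f2)                      # 0.5
```

In the study these scores would be computed once per year, giving a time series of \(F(x_1)\), \(F(x_2)\) and \(T\).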

### Coupling coordination degree model

The coupling coordination degree model is commonly used to measure the coordination development level of several systems^{62}. It can not only reveal the interaction relationship between new urbanization and carbon emission, but also find the effective path to realize the coordinated development through their internal dynamic relationship.

#### Coupling degree

The coupling degree model is used to measure the interaction between the new urbanization and carbon emission systems, and is the basis of the coupling coordination degree model. It is given by formula (9).

$$C = \left( {\frac{{F\left( {x_{1} } \right)F\left( {x_{2} } \right)}}{{\left( {F\left( {x_{1} } \right) + F\left( {x_{2} } \right)} \right)^{2} }}} \right)^{\frac{1}{2}}$$

(9)

where \(C\) \((0 \le C \le 1)\) is the coupling degree between new urbanization and carbon emission; a larger value indicates stronger coupling between the two systems.
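A minimal sketch of the coupling degree follows. It assumes that the denominator of formula (9) is the squared sum \(\left( F(x_1) + F(x_2) \right)^2\); the function name, the sample scores and the choice to return 0 when both scores are zero are likewise assumptions made for illustration.

```python
import math

def coupling_degree(f1, f2):
    """Coupling degree C, assuming the denominator is (F1 + F2) squared."""
    total = f1 + f2
    if total == 0:
        # Assumption: with both scores at zero, report no coupling.
        return 0.0
    return math.sqrt(f1 * f2 / total ** 2)

# Hypothetical development scores.
c_balanced = coupling_degree(0.5, 0.5)   # equal scores
c_skewed = coupling_degree(0.2, 0.8)     # unequal scores
```

Under this reading, \(C\) is largest when the two scores are equal and, for non-negative scores, does not exceed 0.5; some studies multiply by 2 so that the maximum is 1, which is not assumed here.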

#### Coupling coordination degree

Compared with the coupling degree model, the coupling coordination degree model has higher stability and a wider scope of application. It allows every year in the time series of the study area to be evaluated and compared quantitatively. It is given by formula (10).

$$D = \sqrt {C \times T}$$

(10)

where, the value of \(\mathrm{D}\) obtained by formula (10) is the coupling coordination degree, which can be used to evaluate the effect between new urbanization and carbon emission.

#### Matching coefficient

The matching coefficient is used to analyze the relative lag between the development levels of new urbanization and carbon emission. The influence relationship between the two systems is then analyzed in combination with formula (10). It is given by formula (11).

$$U = \frac{{F\left( {x_{1} } \right)}}{{F\left( {x_{2} } \right)}}$$

(11)

where, the value of \(\mathrm{U}\) obtained by formula (11) is the matching coefficient, which can be used to observe and analyze the relative lag trend between new urbanization and carbon emission.
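Formulas (10) and (11) can be evaluated together for each year once \(C\), \(T\), \(F(x_1)\) and \(F(x_2)\) are known. The sketch below is a minimal illustration; the function names and input values are hypothetical.

```python
import math

def coordination_degree(c, t):
    """Formula (10): D = sqrt(C * T)."""
    return math.sqrt(c * t)

def matching_coefficient(f1, f2):
    """Formula (11): U = F(x1) / F(x2)."""
    if f2 == 0:
        raise ZeroDivisionError("carbon emission score is zero; U is undefined")
    return f1 / f2

# Hypothetical values for one year.
D = coordination_degree(0.5, 0.5)        # 0.5
U = matching_coefficient(0.75, 0.375)    # 2.0
```

A value of \(U\) above 1 simply means the new urbanization score exceeds the carbon emission score in that year, and a value below 1 means the reverse; how far from 1 counts as a lag is a judgment the sketch leaves open.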

### Gray prediction model

The gray prediction model is a method for making predictions from a small amount of incomplete information by establishing a mathematical model. It is an effective tool for small-sample prediction problems. Its principle is to generate data series with strong regularity through correlation analysis of the original data, and then to use the corresponding differential equation model to predict the development trend^{63,64,65,66,67}. In this study, the gray prediction model is used to predict the coordination degree between new urbanization and carbon emission in Anhui Province for 2023–2032. The aim is to provide suggestions for the construction of low-carbon urbanization based on the predicted coupling coordination phase. The specific steps are as follows.

First, the original data sequence of new urbanization and carbon emission is constructed as \(X^{\left( 0 \right)} = \left\{ {x^{\left( 0 \right)} \left( 1 \right),x^{\left( 0 \right)} \left( 2 \right), \ldots ,x^{\left( 0 \right)} \left( n \right)} \right\}\), with \(x^{\left( 0 \right)} \left( k \right) \ge 0,\;k = 1,2, \ldots ,n\), and then the \(r\)-times accumulated sequence is generated by formula (12).

$$x^{\left( r \right)} \left( k \right) = \sum\limits_{i = 1}^{k} {x^{{\left( {r - 1} \right)}} \left( i \right)} ,k = 1,2, \ldots ,n,r \ge 1$$

(12)

where the accumulation in formula (12) turns a group of irregular values into a group of more regular values.
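The accumulation in formula (12) is a repeated running sum, as the following minimal sketch shows. The function name and sample sequence are hypothetical.

```python
from itertools import accumulate

def accumulate_r_times(x0, r=1):
    """Apply the running sum of formula (12) r times to the sequence x0."""
    seq = list(x0)
    for _ in range(r):
        seq = list(accumulate(seq))
    return seq

# Hypothetical original sequence.
x0 = [2, 3, 1, 4]
x1 = accumulate_r_times(x0, r=1)   # [2, 5, 6, 10]
x2 = accumulate_r_times(x0, r=2)   # [2, 7, 13, 23]
```

Because the original values are non-negative, each accumulated sequence is non-decreasing, which is the sense in which the irregular fluctuations of the raw data are smoothed.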

Second, the mean sequence of adjacent accumulated values is generated, and the gray differential equation model is then constructed, as in formulas (13)–(14).

$$z^{\left( 1 \right)} \left( k \right) = \alpha x^{\left( 1 \right)} \left( k \right) + \left( {1 - \alpha } \right)x^{\left( 1 \right)} \left( {k - 1} \right),k = 1,2, \ldots ,n,\;\alpha = 0.5$$

(13)

$$d\left( k \right) + az^{\left( 1 \right)} \left( k \right) = b$$

(14)

where \(d\left( k \right)\) is the gray derivative, \(a\) is the development coefficient, \(z^{\left( 1 \right)} \left( k \right)\) is the whitening background value, and \(b\) is the gray action quantity.

Third, the data matrices are constructed to obtain the time response function, and the accumulated prediction sequence and the restored predicted values are then obtained, as in formulas (15)–(17).

$$U = \left( {\begin{array}{*{20}c} a \\ b \\ \end{array} } \right),\;B = \left( {\begin{array}{*{20}c} { - z^{\left( 1 \right)} \left( 2 \right)} & 1 \\ { - z^{\left( 1 \right)} \left( 3 \right)} & 1 \\ \vdots & \vdots \\ { - z^{\left( 1 \right)} \left( n \right)} & 1 \\ \end{array} } \right),\;Y = \left( {\begin{array}{*{20}c} {x^{\left( 0 \right)} \left( 2 \right)} \\ {x^{\left( 0 \right)} \left( 3 \right)} \\ \vdots \\ {x^{\left( 0 \right)} \left( n \right)} \\ \end{array} } \right),\;Y = BU$$

(15)

$$x^{\left( 1 \right)} \left( k \right) = \left( {x^{\left( 0 \right)} \left( 1 \right) - \frac{b}{a}} \right)e^{{ - a\left( {k - 1} \right)}} + \frac{b}{a},k = 1,2, \ldots ,n$$

(16)

$$x^{\left( 0 \right)} \left( k \right) = x^{\left( 1 \right)} \left( k \right) - x^{\left( 1 \right)} \left( {k - 1} \right),k = 1,2, \ldots ,n$$

(17)

where formulas (13) and (14) are the premises of formula (15), in which the gray prediction model is expressed as \(Y = BU\). The accumulated predicted sequence \(x^{\left( 1 \right)} \left( k \right)\) obtained from formula (16) is converted into annual predicted values through the inverse of formula (12), that is, by applying formula (17).
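The steps in formulas (12)–(17) can be sketched end to end. The following is a minimal illustration, assuming that \(a\) and \(b\) are estimated from \(Y = BU\) by ordinary least squares (the usual choice for this model, although the text does not spell out the estimator) and that formula (16) is also evaluated beyond \(k = n\) to forecast. Function names and the sample series are hypothetical.

```python
import math
from itertools import accumulate

def gm11_fit(x0):
    """Estimate (a, b) in x0(k) + a*z1(k) = b by least squares (assumption)."""
    x1 = list(accumulate(x0))                                  # formula (12), r = 1
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))] # formula (13), k = 2..n
    y = x0[1:]
    count = len(z)
    s_z, s_y = sum(z), sum(y)
    s_zz = sum(v * v for v in z)
    s_zy = sum(zv * yv for zv, yv in zip(z, y))
    # Regress y on z: y = b - a*z, so the slope equals -a.
    slope = (count * s_zy - s_z * s_y) / (count * s_zz - s_z ** 2)
    a = -slope
    b = (s_y - slope * s_z) / count
    return a, b

def gm11_predict(x0, a, b, steps):
    """Fitted values for the n observed periods, then `steps` forecasts."""
    def x1_hat(k):  # formula (16), k is 1-based
        return (x0[0] - b / a) * math.exp(-a * (k - 1)) + b / a
    out = [x0[0]]
    for k in range(2, len(x0) + steps + 1):
        out.append(x1_hat(k) - x1_hat(k - 1))                  # formula (17)
    return out

# Hypothetical series growing by 10% per period.
x0 = [100 * 1.1 ** k for k in range(6)]
a, b = gm11_fit(x0)
pred = gm11_predict(x0, a, b, steps=2)
```

For a growing series the development coefficient \(a\) comes out negative. The sketch does not guard against \(a = 0\), where formula (16) is undefined, or against very short series.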

Finally, the predicted values are compared with the actual values to verify whether they pass the residual test in formula (18).

$$\varepsilon \left( k \right) = \frac{{x^{\left( 0 \right)} \left( k \right) - \hat{x}^{\left( 0 \right)} \left( k \right)}}{{x^{\left( 0 \right)} \left( k \right)}},k = 1,2, \ldots ,n$$

(18)

where, \(\left| {\varepsilon \left( k \right)} \right| < 0.1\) indicates high residual test accuracy and prediction accuracy.
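The residual test in formula (18) can be sketched as follows. This is a minimal illustration; the function names and the sample actual and predicted values are hypothetical, and the check assumes every actual value is non-zero.

```python
def relative_residuals(actual, predicted):
    """Formula (18): relative residual for each period."""
    return [(x - p) / x for x, p in zip(actual, predicted)]

def passes_residual_test(actual, predicted, threshold=0.1):
    """True when every |epsilon(k)| is below the threshold (0.1 in the text)."""
    return all(abs(e) < threshold
               for e in relative_residuals(actual, predicted))

# Hypothetical actual and predicted values.
actual = [100.0, 110.0, 120.0]
good_fit = [100.0, 104.5, 126.0]   # residuals about 0, 0.05, -0.05
poor_fit = [100.0, 90.0, 120.0]    # second residual about 0.18

ok = passes_residual_test(actual, good_fit)
not_ok = passes_residual_test(actual, poor_fit)
```

Requiring every period to pass is the strictest reading of the criterion; averaging the absolute residuals would be a looser alternative that the text does not describe.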

Predictions that pass the residual test in formula (18) are considered scientifically sound and suitable for application.