Economic Evaluation of an Investment in Medical Websites and Medical Web-Based Services


Factors that affect the time that the users spend on the medical websites



Date: 02.02.2017


The medical websites were ranked according to the value of the “Global Rank” variable, and the 100 websites on which users tend to spend the longest time and the 100 websites on which they spend the shortest time were selected, in an attempt to identify the factors that can affect this variable.



Table 28 shows that the factors that differ most between the medical websites with the longest and the shortest time spent on them by users are: the status of the website, the nature/structure of the organizations that support/provide the websites, the interactive services offered, the certification, the source of income, the web presence in social networks, the various estimations based on the traffic of the website, and the percentage of the website’s visits that are referred from search engines.
Table 28: Metrics of the websites with the longest and shortest time spent on them by the users

| Characteristic | Medical websites with the highest percentage of visits referred from search engines (Percentage/Number) | Medical websites with the lowest percentage of visits referred from search engines (Percentage/Number) |
|---|---|---|
| Interactive service offering | 56% | 52% |
| Online websites | 84% | 79% |
| Specific health problem | 16% | 23% |
| General Health | 25% | 50% |
| Drug info | 6% | 2% |
| Medical resources | 27% | 30% |
| Medical products | 6% | 8% |
| Medical Education | 15% | 7% |
| Kids health | 3% | 4% |
| Senior Health | 2% | 1% |
| Profit organizations | 26% | 50% |
| Non-profit private organizations | 57% | 44% |
| Non-profit governmental | 17% | 6% |
| Ask a doctor | 5% | 11% |
| Find a doc/local care | 22% | 12% |
| Drug symptom checker | 9% | 9% |
| Other | 26% | 23% |
| Social network | 52% | 38% |
| Certification | 15% | 24% |
| Value for patients | 69% | 70% |
| Blog | 18% | 15% |
| Average global rank | 266,136 | 37,350,140 |
| Average % of visits referred from search engines | 24% | 16.3% |
| Average time on website | 4.4 min | 1 min |
| Average lifetime | 11.1 years | 9.5 years |
| Average unique visits | 964,192 | 163,321 |
| Average linked websites | 780 | 180 |
| Governmental funding | 22% | 6% |
| Products/services/shop | 26% | 40% |
| Donations/sponsors/support from non-profit org./grants | 17% | 21% |
| Membership/registration | 5% | 5% |
| University funding/research/project funding | 21% | 6% |
| Partners/mother profit org. | 2% | 7% |
| Advertisements | 5% | 8% |
| Average Google rank | 6.5 | 4.5 |
| Average Facebook likes | 587 | 200 |

In order to examine which factors in our dataset can affect the time that users tend to spend on the medical websites, a linear regression model was implemented. It tests the hypothesis that the factors that presented the greatest difference between the websites with the longest and the shortest time spent on them do indeed affect the dependent variable, namely the “time the users spend on website” variable. The independent variables/predictors are the medical website status, the interactive service offering, the web presence in social networks, the global rank, the kind of interactive services, the category to which the medical website belongs, the presence of any kind of certification, the number of unique visits and linked websites, the nature/structure of the organization that supports the medical website, and the source of income of the websites.

In line with the aforementioned assumptions of the linear regression model, the first step in preparing the data for the regression was to check the normality criterion. In the sample, the “time the users spend on the website” variable had to be transformed by taking the decimal logarithm of its values in order to follow the normal distribution. The variable presents high positive Skewness and Kurtosis, as shown in Figure 27, which presents the histogram and the descriptive statistics of the variable’s values. In a standard normal distribution, Skewness should be 0 and Kurtosis 3.
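The effect of a decimal-logarithm transformation on a right-skewed variable can be illustrated with a minimal sketch. The minutes-on-site values below are invented for illustration and are not taken from the dataset; the `skewness` helper is a hypothetical name that computes the simple population skewness, which may differ slightly from the bias-corrected figure that a statistics package reports.

```python
import math

def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (0 for a symmetric distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical time-on-site values in minutes (illustrative, not the study data).
minutes = [0.5, 1, 1, 2, 2, 3, 4, 6, 10, 30]

# Decimal logarithm, as applied to the dependent variable in the text.
log_minutes = [math.log10(m) for m in minutes]

raw_skew = skewness(minutes)
log_skew = skewness(log_minutes)
print(f"skewness before: {raw_skew:.2f}, after log10: {log_skew:.2f}")
```

In this toy sample the long right tail dominates the raw values, and the logarithm pulls it in, so the skewness moves much closer to 0.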


Figure 27: Histograms of the dependent variable before and after the transformation in order to follow a normal distribution



Table 29: Descriptive Statistics of the “Log10 time on website” variable

| Statistic | Value |
|---|---|
| N | 307 |
| Missing | 9 |
| Mean | .4300 |
| Median | .4771 |
| Std. Deviation | .48 |
| Variance | .20882 |
| Skewness | .044 |
| Std. Error of Skewness | -.347 |
| Kurtosis | .139 |
| Std. Error of Kurtosis | 0.785 |

As mentioned earlier, P-P plots and Q-Q plots are also useful for assessing whether the data follow a specific distribution.


Figure 28: P-P and Q-Q plots of the “time spent on website” variable


Although the normality tests indicate non-normality, the sample is large, with more than 200 observations (N=316), so it is considered better to look at the shape of the distribution and the values of the Skewness and Kurtosis rather than to estimate their significance (Field, 2009; MVP Programs-Normality Testing Guidelines, accessed 11/1/2012). The z-score statistic for the Skewness and the Kurtosis is useful to calculate, since it enables us to compare the values of these metrics across different samples. The z-scores are calculated by subtracting the mean of the normal distribution, which is 0, and dividing the result by the standard error of the Skewness or the Kurtosis:

\[ Z_{skewness} = \frac{Skewness - 0}{SE_{skewness}}, \qquad Z_{kurtosis} = \frac{Kurtosis - 0}{SE_{kurtosis}} \]

The resulting values are Zskewness = 0.486 and Zkurtosis = 2.83. Both are below the threshold of 3.29, above which the values would be significant at p<0.001, so we can assume normality (Field, 2009).
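The z-score check can be sketched in a few lines. The function names are hypothetical, and the standard-error expressions are the commonly used sample formulas that depend only on the sample size; whether the original analysis used exactly these expressions is an assumption. The z-scores passed to the check are the ones reported in the text.

```python
import math

def se_skewness(n):
    """Commonly used standard error of sample skewness (assumed formula)."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n):
    """Commonly used standard error of sample excess kurtosis (assumed formula)."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

def z_score(statistic, standard_error):
    """Subtract the normal-distribution value (0) and divide by the standard error."""
    return (statistic - 0.0) / standard_error

THRESHOLD = 3.29  # |z| above this is significant at p < 0.001 (Field, 2009)

def looks_normal(z_skew, z_kurt, threshold=THRESHOLD):
    """True when both z-scores stay below the threshold in absolute value."""
    return abs(z_skew) < threshold and abs(z_kurt) < threshold

# z-scores reported in the text for the log-transformed variable.
print(looks_normal(0.486, 2.83))
# Standard errors implied by the sample size of 307 valid observations.
print(round(se_skewness(307), 3), round(se_kurtosis(307), 3))
```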

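Table 30: Tests of Normality for the “Log10 time on website” variable

| Variable | Kolmogorov-Smirnov Statistic | Kolmogorov-Smirnov df | Kolmogorov-Smirnov Sig. | Shapiro-Wilk Statistic | Shapiro-Wilk df | Shapiro-Wilk Sig. |
|---|---|---|---|---|---|---|
| LOG10timeonite | .148 | 307 | .000 | .949 | 307 | .000 |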

The next step of the analysis assessed the linear relationship between the dependent variable and the numeric independent variables by estimating Pearson’s R and Spearman’s rho correlation coefficients, as was done for the “Global Rank” dependent variable. The results, summarized in Table 31, show that the independent variables present a statistically significant positive linear relationship with the dependent variable, except for the “Global Rank” and the “Percentage of visits referred from search engines” predictors, which present a negative linear relationship with the dependent variable. This means that as the percentage of visits to the website that are referred from search engines increases, the time that users spend on the website tends to decrease. A possible explanation is that users become familiar with the website as they visit it more frequently through a search engine and know where to find the information they need.
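As a minimal sketch of the two correlation measures, the code below computes Pearson’s R directly and Spearman’s rho as Pearson’s R on average ranks. The function names and the two short data series are hypothetical and serve only to show a negative association of the kind described above; they are not values from the dataset.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical illustration: search-engine share (%) against log10 time on site.
search_share = [5, 10, 15, 20, 30, 40, 55]
log_time = [0.9, 0.8, 0.85, 0.6, 0.5, 0.3, 0.1]

print(round(pearson_r(search_share, log_time), 3))
print(round(spearman_rho(search_share, log_time), 3))
```

Spearman’s rho depends only on the ordering of the values, which makes it the safer choice when a variable such as a rank is far from normally distributed.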



Table 31: Pearson’s R and Spearman estimation for the linear relationship between dependent-independent variables

| Variable | Statistic | LOG10 “Time spent on the website” | Google Rank |
|---|---|---|---|
| LOG10 “Time spent on the website” | Spearman’s rho Correlation Coefficient | 1 | -.578** |
| | Sig. (2-tailed) | | .000 |
| | N | 307 | 279 |
| Google Rank | Spearman’s rho Correlation Coefficient | -.578** | 1 |
| | Sig. (2-tailed) | .000 | |
| | N | 279 | 307 |

| Variable | Statistic | LOG10 “Time spent on the website” | Unique Visits |
|---|---|---|---|
| LOG10 “Time spent on the website” | Spearman’s rho Correlation Coefficient | 1.000 | .346** |
| | Sig. (2-tailed) | . | .000 |
| | N | 279 | 261 |
| Unique Visits | Spearman’s rho Correlation Coefficient | .346** | 1.000 |
| | Sig. (2-tailed) | .000 | . |
| | N | 261 | 279 |

| Variable | Statistic | LOG10 “Time spent on the website” | Linked Websites |
|---|---|---|---|
| LOG10 “Time spent on the website” | Spearman’s rho Correlation Coefficient | 1.000 | .229** |
| | Sig. (2-tailed) | . | .000 |
| | N | 279 | 279 |
| Linked Websites | Spearman’s rho Correlation Coefficient | .229** | 1.000 |
| | Sig. (2-tailed) | .000 | . |
| | N | 279 | 316 |

| Variable | Statistic | LOG10 “Time spent on the website” | Fraction of visits from search engines |
|---|---|---|---|
| LOG10 “Time spent on the website” | Pearson’s R Correlation Coefficient | 1 | -.189** |
| | Sig. (2-tailed) | | .003 |
| | N | 279 | 253 |
| Fraction of visits from search engines | Spearman’s rho Correlation Coefficient | -.189** | 1 |
| | Sig. (2-tailed) | .003 | |
| | N | 253 | 258 |

After implementing the linear regression model, the R-squared value showed that the independent variables account for 20.9% of the variation in the dependent variable, so there may be other variables that also explain the variation. The ANOVA test evaluates the predictive ability of the model. Since the F-test is significant (p<0.01), the model provides a significantly better prediction than using the mean value of the dependent variable (Table 32). In addition, the value of the Durbin-Watson test for serial correlation (2.143) is substantially above 1, which indicates independence of errors, in line with the assumptions of the linear regression.
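The fit statistics can be recomputed from the sums of squares and degrees of freedom in the ANOVA output. The sketch below uses the regression and residual figures reported in Table 32; the function name is hypothetical, and the adjusted R-squared uses the standard degrees-of-freedom correction.

```python
def regression_fit_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Return (R-squared, adjusted R-squared, F) from ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    df_total = df_regression + df_residual
    r_squared = ss_regression / ss_total
    # Adjusted R-squared penalises the number of predictors in the model.
    adjusted = 1.0 - (1.0 - r_squared) * df_total / df_residual
    # F is the ratio of the regression mean square to the residual mean square.
    f_statistic = (ss_regression / df_regression) / (ss_residual / df_residual)
    return r_squared, adjusted, f_statistic

# Values taken from the ANOVA part of the model summary (Table 32).
r_squared, adjusted_r_squared, f_statistic = regression_fit_summary(
    ss_regression=2.662, ss_residual=10.103, df_regression=26, df_residual=265
)
print(f"R2={r_squared:.3f}  adjusted R2={adjusted_r_squared:.3f}  F={f_statistic:.3f}")
```

The recomputed R-squared is about 0.209, which agrees with the R Square column of the model summary.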



Table 32: Linear regression model’s summary for the Log10 “Time spent on website” variable

Linear Regression Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | Durbin-Watson |
|---|---|---|---|---|---|
| 1 | .457 | .209 | .131 | .19526 | 2.143 |

ANOVA

| Model | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 2.662 | 26 | .102 | 2.686 | .000 |
| Residual | 10.103 | 265 | .038 | | |
| Total | 12.765 | 291 | | | |









Moreover, the coefficients output of the linear regression model (Table 33) shows that the percentage of visits to the website referred from search engines, the “specific healthcare issues” website category, governmental funding as a source of income, the unique visits and the Google rank are significant predictors of the dependent variable. The sign of the coefficient is negative for the “percentage of visits to the website referred from search engines” predictor, implying that as the fraction of visits referred from search engines increases, the time that users spend on the website tends to decrease. The coefficient of the “specific healthcare issues” category is also negative, while for the rest of the significant predictors the positive sign implies a positive relationship with the dependent variable.
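Reading the significant predictors off a coefficients table amounts to filtering on the Sig. column and checking the sign of B. The sketch below does this for a subset of rows copied from Table 33; the 0.05 cut-off and the variable names are assumptions made for the illustration.

```python
ALPHA = 0.05  # assumed significance level

# Subset of Table 33: predictor -> (unstandardized B, Sig.)
coefficients = {
    "Fraction of visits from search engines": (-0.004, 0.000),
    "Specific health issues": (-0.158, 0.008),
    "Governmental funding": (0.158, 0.012),
    "Unique visits": (1.395e-8, 0.049),
    "Google Rank": (0.018, 0.047),
    "Find a doctor or local care": (0.062, 0.078),
    "Certification": (0.013, 0.669),
}

# Keep predictors whose p-value is below ALPHA and record the direction of the effect.
significant = {
    name: ("negative" if b < 0 else "positive")
    for name, (b, p_value) in coefficients.items()
    if p_value < ALPHA
}

for name, direction in significant.items():
    print(f"{name}: {direction}")
```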


Table 33: Coefficients output

| Model | Unstandardized B | Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | .388 | .128 | | 3.027 | .003 |
| Service | .017 | .030 | .041 | .590 | .555 |
| Website status | .033 | .038 | .054 | .872 | .384 |
| Ask a Doctor | -.063 | .046 | -.085 | -1.353 | .177 |
| Find a doctor or local care | .062 | .035 | .118 | 1.767 | .078 |
| Drug or symptom checker | -.003 | .047 | -.004 | -.061 | .951 |
| Social Networks | .020 | .026 | .049 | .800 | .424 |
| Certification | .013 | .030 | .026 | .429 | .669 |
| Fraction of visits from Search engines | -.004 | .001 | -.210 | -3.552 | .000 |
| Profit | -.053 | .034 | -.122 | -1.565 | .119 |
| Non-profit governmental | -.107 | .065 | -.160 | -1.656 | .099 |
| Specific health issues | -.158 | .059 | -.323 | -2.687 | .008 |
| General Health | -.102 | .060 | -.215 | -1.696 | .091 |
| Medical resources | -.081 | .062 | -.172 | -1.317 | .189 |
| Governmental Funding | .158 | .062 | .251 | 2.526 | .012 |
| Products services shop | .097 | .104 | .211 | .938 | .349 |
| Donations sponsors Grants | .052 | .103 | .122 | .507 | .613 |
| Membership | .096 | .109 | .128 | .881 | .379 |
| University and research funding | .178 | .109 | .273 | 1.635 | .103 |
| Partners/ mother profit org. | .133 | .122 | .110 | 1.087 | .278 |
| Advertisements | .070 | .110 | .087 | .639 | .523 |
| Drug info | -.107 | .079 | -.117 | -1.358 | .176 |
| Medical products | .003 | .086 | .002 | .030 | .976 |
| Medical education | -.126 | .073 | -.175 | -1.734 | .084 |
| Unique visits | 1.395E-8 | .000 | .122 | 1.980 | .049 |
| Google Rank | .018 | .009 | .136 | 1.997 | .047 |
| Linked Websites | 6.529E-6 | .000 | .045 | .720 | .472 |

Furthermore, in order to examine the regression residuals for traces of heteroscedasticity and to check the normality criterion of the residual errors, a scatterplot of the regression standardised residuals against the regression standardised predicted values was produced. As the histogram and the P-P plot in Figure 29 show, the residuals deviate only very slightly from the normal distribution. Moreover, the scatterplot of the standardised residuals against the standardised predicted values does not present any specific pattern, which further supports the assumption that there is no heteroscedasticity.



Figure 29: Examining the regression residuals for traces of heteroscedasticity and the normality criterion of the residual errors


Finally, in order to validate the linear regression model and test its generalizability with a cross-validation analysis, the sample was split into two subsamples in a 50%-50% ratio, as proposed by Bagley et al. (2001), Godfrey (1985), Marill (2004), Schneider et al. (2010), and Field (2009). The validation results show that the model can predict the relationship between the dependent and the independent variables, since both subsamples yielded a significant model with the same significant independent variables, but it cannot predict the strength of this relationship, since the R-squared of one subsample differed by more than 5% from the R-squared of the other subsample (Table 34).
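The comparison of the two subsamples can be reproduced from the sums of squares reported in the cross-validation ANOVA output (Table 34). The sketch below reads the 5% criterion as a difference of five percentage points in R-squared, which is an assumption; the function and variable names are hypothetical.

```python
def r_squared(ss_regression, ss_total):
    """Share of the total variation explained by the regression."""
    return ss_regression / ss_total

MAX_GAP = 0.05  # assumed reading of the 5% criterion (percentage points of R-squared)

# Sums of squares from the two ANOVA blocks of Table 34.
r2_first = r_squared(ss_regression=1.414, ss_total=5.370)
r2_second = r_squared(ss_regression=2.624, ss_total=7.387)

gap = abs(r2_first - r2_second)
strength_is_stable = gap <= MAX_GAP

print(f"R2 first={r2_first:.3f}  R2 second={r2_second:.3f}  gap={gap:.3f}")
print("strength of relationship replicates:", strength_is_stable)
```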

Table 34: Cross validation of the “LOG10 Time spent on website” linear model

Model Summary

| Model | R, split = .00 (Selected) | R, split ~= .00 (Unselected) | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|---|
| 1 | .513 | .212 | .263 | .098 | .18467 |

| Model | R, split = .00 (Selected) | R, split ~= .00 (Unselected) | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|---|
| 1 | .596 | .158 | .355 | .218 | .19759 |

ANOVA

| Model | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 1.414 | 26 | .054 | 1.595 | .049 |
| Residual | 3.956 | 116 | .034 | | |
| Total | 5.370 | 142 | | | |

| Model | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 2.624 | 26 | .101 | 2.585 | .000 |
| Residual | 4.763 | 122 | .039 | | |
| Total | 7.387 | 148 | | | |












18.5 Factors that Affect the Offering of Interactive Web Applications/Services


In order to investigate the factors that affect whether medical websites offer interactive web applications/services, a logistic regression model was applied with the “Interactive applications/services offering” dummy variable as the dependent variable. The independent variables/predictors were basic metrics of the medical websites’ performance, such as:

  • The global rank of the medical websites

  • The lifetime of the medical websites

  • The fraction of the visits to the medical website that were referred from search engines

  • The time that the users tend to spend on the medical website

  • The nature of the organization that supports/provides the medical website

  • The web presence of the medical website in social networks

  • The web presence of any certification

The hypothesis to be tested was that the offering of interactive medical web-based applications/services increases as the global rank of the website moves to better positions, and/or the lifetime of the medical website increases, and/or the time that users spend on the website increases, and/or the medical website is active in social networks, and/or the website is supported by non-for-profit organizations.

In examining the Logistic Regression Model Summary (Table 35), the values of two pseudo-R-squared measures, the Cox and Snell R-squared and the Nagelkerke R-squared, were taken into consideration. The Cox and Snell R-squared reflects the improvement of the full model over the intercept-only model through the ratio of their likelihoods (the smaller the ratio, the greater the improvement). Since the Cox and Snell pseudo R-squared has a maximum value that is less than one, the Nagelkerke R-squared adjusts it so that the range of possible values extends to 1.
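The relationship between the two pseudo-R-squared measures can be sketched from the model deviances (-2 log likelihood). In the example below only the model deviance of 288.363 comes from the model summary; the null deviance and the sample size are hypothetical values chosen for illustration, so the printed figures are not expected to match the reported ones.

```python
import math

def cox_snell_r2(null_deviance, model_deviance, n):
    """Cox and Snell R2 = 1 - (L_null / L_model)**(2/n), written with deviances."""
    return 1.0 - math.exp((model_deviance - null_deviance) / n)

def nagelkerke_r2(null_deviance, model_deviance, n):
    """Cox and Snell R2 rescaled by its maximum attainable value."""
    max_cox_snell = 1.0 - math.exp(-null_deviance / n)
    return cox_snell_r2(null_deviance, model_deviance, n) / max_cox_snell

MODEL_DEVIANCE = 288.363  # -2 log likelihood reported in the model summary
NULL_DEVIANCE = 340.0     # hypothetical: deviance of the intercept-only model
SAMPLE_SIZE = 250         # hypothetical number of websites in the model

cs = cox_snell_r2(NULL_DEVIANCE, MODEL_DEVIANCE, SAMPLE_SIZE)
nk = nagelkerke_r2(NULL_DEVIANCE, MODEL_DEVIANCE, SAMPLE_SIZE)
print(f"Cox & Snell R2={cs:.3f}  Nagelkerke R2={nk:.3f}")
```

Because the rescaling divides by a number below one, the Nagelkerke value is always at least as large as the Cox and Snell value for the same model.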

Table 35: Logistic Regression Model Summary

| Model | -2 Log likelihood | Cox & Snell R Square | Nagelkerke R Square |
|---|---|---|---|
| 1 | 288.363 | .193 | .260 |

Moreover, the coefficients output (Table 36) shows that the variables that have a significant impact on the dependent variable are:



  • The Fraction of visits from search engines

  • The Status (Online/ Offline)

  • The Web presence of a certification from an accredited organization

  • The Category of the medical websites

To conclude, the probability that a website offers interactive medical web-based applications/services increases as the fraction of visits referred from search engines increases, and when the website is online, belongs to the “drug information” category and presents a certification from an accredited organization.
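How the logistic coefficients translate into odds ratios and probabilities can be shown with a short sketch. The B values are copied from the coefficients output (Table 36), but the example treats every other predictor as zero and assumes that the search-engine fraction is measured in percent and that the status, certification and category variables are 0/1 dummies; these are simplifying assumptions, so the probabilities are illustrative only.

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# B coefficients of the significant predictors and the constant (Table 36).
B = {
    "constant": -4.051,
    "search_fraction": 0.033,
    "online": 1.380,
    "certified": 0.474,
    "drug_info": 2.947,
}

def probability(search_fraction, online, certified, drug_info):
    """Probability of offering interactive services, other predictors set to 0."""
    z = (
        B["constant"]
        + B["search_fraction"] * search_fraction
        + B["online"] * online
        + B["certified"] * certified
        + B["drug_info"] * drug_info
    )
    return logistic(z)

# exp(B) is the multiplicative change in the odds for a one-unit increase.
odds_ratios = {name: math.exp(b) for name, b in B.items() if name != "constant"}

baseline = probability(search_fraction=30, online=0, certified=0, drug_info=0)
favourable = probability(search_fraction=30, online=1, certified=1, drug_info=1)
print(f"baseline={baseline:.3f}  favourable={favourable:.3f}")
print({name: round(value, 2) for name, value in odds_ratios.items()})
```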

Table 36: Coefficients Output

| Model | B | S.E. | Wald | df | Sig. |
|---|---|---|---|---|---|
| Global Rank of the medical websites | .000 | .000 | .747 | 1 | .387 |
| Fraction of visits referred from search engines | .033 | .015 | 4.520 | 1 | .034 |
| Time spent on the medical website | .060 | .104 | .333 | 1 | .564 |
| Lifetime | .051 | .039 | 1.718 | 1 | .190 |
| Status (Online/ Offline) of the medical website | 1.380 | .508 | 7.398 | 1 | .007 |
| Certification | .474 | .362 | 1.710 | 1 | .050 |
| Social network’s web presence | .150 | .301 | .248 | 1 | .619 |
| Specific healthcare issues category | 1.558 | 1.313 | 1.406 | 1 | .236 |
| General Health category | 2.173 | 1.320 | 2.708 | 1 | .100 |
| Drug information category | 2.947 | 1.534 | 3.690 | 1 | .049 |
| Medical Resources category | .860 | 1.312 | .430 | 1 | .512 |
| Medical product category | -.546 | 1.711 | .102 | 1 | .749 |
| Medical education category | 1.126 | 1.359 | .686 | 1 | .407 |
| Children health | 1.743 | 1.479 | 1.388 | 1 | .239 |
| Non-for-Profit organization | -.073 | .496 | .022 | 1 | .883 |
| Profit organization | .044 | .508 | .007 | 1 | .931 |
| Constant | -4.051 | 1.629 | 6.182 | 1 | .013 |



