Economic Evaluation of an Investment in Medical Websites and Medical Web-Based Services


Other Factors that Affect the Excellence and the Survivability of the Medical Websites



Date: 02.02.2017

18.4 Other Factors that Affect the Excellence and the Survivability of the Medical Websites








Factors that Affect the Frequency of Web Presence of the Medical Websites in Search Engines

The frequency with which medical websites appear in the first pages of various search engines (Google.com, Yahoo.com etc.) can prove to be a crucial factor for their success, since websites that appear there can attract more users. In order to identify the factors that have a significant impact on the web presence of the medical websites in search engines, the same methodological steps will be applied, using as criterion variable the “Fraction of visits to the medical website referred to from search engines”.



The medical websites were ranked according to the fraction of their visits referred from search engines and, following the approach used for the aforementioned final taxonomies of the “highest ranked of the highest ranked” and “lowest ranked of the lowest ranked” medical websites, the 100 websites with the highest percentage of visits referred from search engines and the 100 with the lowest were selected. It is observed that almost the same factors that differed greatly between the final taxonomies differ greatly in this case too.
Table : Metrics of the websites with the highest and lowest percentage of visits referred to from search engines

| Characteristic | Websites with the highest percentage of visits referred from search engines (Percentage/Number) | Websites with the lowest percentage of visits referred from search engines (Percentage/Number) |
|---|---|---|
| Interactive service offering | 74% | 53% |
| Online websites | 96% | 78% |
| Specific health problem | 30% | 18% |
| General Health | 31% | 27% |
| Drug info | 6% | 4% |
| Medical resources | 22% | 28% |
| Medical products | 2% | 13% |
| Medical Education | 4% | 8% |
| Kids health | 3% | 1% |
| Senior Health | 2% | 1% |
| Profit organizations | 33% | 48% |
| Non-profit private organizations | 51% | 45% |
| Non-profit governmental | 16% | 7% |
| Ask a doctor | 13% | 6% |
| Find a doc/local care | 23% | 17% |
| Drug symptom checker | 19% | 4% |
| Other | 29% | 27% |
| Social network | 54% | 42% |
| Certification | 26% | 22% |
| Value for patients | 72% | 66% |
| Blog | 21% | 15% |
| Average global rank | 440632 | 40668455 |
| Average % of visits referred from search engines | 37.4% | 4.6% |
| Average time on website | 3 min | 2 min |
| Average lifetime | 11.4 years | 11.1 years |
| Average unique visits | 890911 | 326175 |
| Average linked websites | 927 | 382 |
| Governmental funding | 20% | 8% |
| Products/services/shop | 22% | 40% |
| Donations/sponsors/support from non-profit org./grants | 26% | 18% |
| Membership/registration | 9% | 7% |
| University funding/research/project funding | 2% | 10% |
| Partners/mother profit org. | 7% | 4% |
| Advertisements | 9% | 9% |
| Average Google rank | 6.3 | 5.4 |
| Average Facebook likes | 892 | 218 |
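The grouping behind this comparison, ranking the websites by the fraction of visits referred from search engines and keeping the two extremes, can be sketched as follows. This is only an illustration under stated assumptions: the field names `search_share` and `interactive`, the helper names and the toy records are hypothetical and are not taken from the study's dataset, which used groups of 100 websites.

```python
from typing import Dict, List, Optional, Tuple

Site = Dict[str, Optional[float]]


def extreme_groups(sites: List[Site], key: str, k: int) -> Tuple[List[Site], List[Site]]:
    """Return the k sites with the highest and the k sites with the lowest value of `key`."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sites with no recorded value cannot be ranked, so they are left out.
    ranked = sorted(
        (s for s in sites if s.get(key) is not None),
        key=lambda s: s[key],
        reverse=True,
    )
    if 2 * k > len(ranked):
        raise ValueError("k is too large for the two groups to be disjoint")
    return ranked[:k], ranked[-k:]


def share(group: List[Site], flag: str) -> float:
    """Percentage of sites in `group` whose 0/1 indicator `flag` is set."""
    return 100.0 * sum(1 for s in group if s.get(flag)) / len(group)


# Hypothetical toy records (search_share in percent, interactive as a 0/1 flag).
sites: List[Site] = [
    {"search_share": 41.0, "interactive": 1},
    {"search_share": 35.5, "interactive": 1},
    {"search_share": 20.0, "interactive": 0},
    {"search_share": 6.2, "interactive": 1},
    {"search_share": 3.1, "interactive": 0},
    {"search_share": None, "interactive": 1},
]

top, bottom = extreme_groups(sites, "search_share", k=2)
print(share(top, "interactive"), share(bottom, "interactive"))
```

With k set to 100 and one indicator per characteristic, the same two calls would produce one row of the comparison per characteristic.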

In order to examine which factors from our dataset can affect the web presence on search engines, that is, the fraction of visits to the websites that are referred from them, a linear regression model will be implemented. This model is used to test the hypothesis that the factors that differed most between the medical websites with the highest percentage of visits referred from search engines and those with the lowest do indeed affect the dependent variable, the “Fraction of visits referred from search engines”. The following will be used as independent variables/predictors:



  • The website status (online/offline)

  • The interactive medical web-based services/applications offering

  • The web presence of the medical website in social networks

  • The global rank

  • The kind of interactive services

  • The category to which the medical website belongs according to its content and services

  • The presence of any kind of certification awarded to the medical website by well-known organizations

  • The number of unique visits and linked websites to the medical website

  • The nature/structure of the organization that supports and provides the medical website

  • The source of generated revenue stream from the medical website operations

According to the aforementioned criteria of the linear regression model, the first step in preparing the data is to check whether the dependent variable follows a normal distribution. In our sample the “Fraction of visits referred from search engines” variable presents Skewness and Kurtosis values close to zero (-.038 and -.438 respectively), as shown in Figure 24, which presents the histogram and the descriptive statistics of the variable’s values. In a normal distribution Skewness should be 0 and Kurtosis 3, which corresponds to a value of 0 on the excess-kurtosis scale that statistical packages commonly report.




| Statistic | Value |
|---|---|
| N | 301 |
| Missing | 15 |
| Mean | 26.6049 |
| Median | 25.5000 |
| Std. Deviation | 10.78403 |
| Variance | 116.295 |
| Skewness | -.038 |
| Std. Error of Skewness | .140 |
| Kurtosis | -.438 |
| Std. Error of Kurtosis | .280 |

Figure : Histogram and descriptive statistics for the “Fraction of visits referred from search engines” variable

As mentioned in previous sections, P-P plots and Q-Q plots are also useful for assessing whether the data follow a specific distribution.

Figure : P-P and Q-Q plots of the “Fraction of visits referred from search engines” variable






The normality tests disagree: the Kolmogorov-Smirnov test indicates non-normality, whereas the Shapiro-Wilk test indicates normality, since its p-value is not significant at the 5% level and the null hypothesis of normality therefore cannot be rejected. Because the sample is large, with more than 200 observations (N=316), it is considered better to look at the shape of the distribution and at the values of the Skewness and Kurtosis rather than to estimate their significance (Field, 2009; MVP Programs-Normality Testing Guidelines, accessed 11/1/2012). It is also useful to calculate the z-scores of the Skewness and the Kurtosis, since they enable us to compare the values of these metrics across different samples. According to Field (2009), “the z-score is simply a score from a distribution that has a mean of 0 and a standard deviation of 1”. To calculate the z-scores of the Skewness and the Kurtosis, we use the formula below, subtracting the mean of the normal distribution, which is 0, and dividing the result by the standard error of the Skewness or Kurtosis, where S and K are the sample Skewness and Kurtosis:

$$z_{\text{skewness}} = \frac{S - 0}{SE_{\text{skewness}}}, \qquad z_{\text{kurtosis}} = \frac{K - 0}{SE_{\text{kurtosis}}}$$

In absolute value, Zskewness = 0.2714 and Zkurtosis = 1.56, so both values are below the threshold of 3.29, above which they would be significant at p<0.001, and we can therefore assume normality (Field, 2009).
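The z-score calculation can be reproduced from the reported descriptive statistics. The sketch below is illustrative only: the function names are hypothetical, and the two standard-error formulas are the standard large-sample expressions for skewness and kurtosis, which the text does not state but which reproduce the reported values of .140 and .280 for N = 301.

```python
import math


def se_skewness(n: int) -> float:
    """Standard error of the sample skewness (standard formula, not given in the text)."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n: int) -> float:
    """Standard error of the sample excess kurtosis (standard formula, not given in the text)."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


def z_score(statistic: float, std_error: float) -> float:
    """Subtract the normal-distribution value of 0 and divide by the standard error."""
    return (statistic - 0.0) / std_error


# Values reported in the descriptive statistics of the dependent variable.
skewness, kurtosis = -0.038, -0.438
z_skew = z_score(skewness, 0.140)
z_kurt = z_score(kurtosis, 0.280)

THRESHOLD = 3.29  # |z| above this would be significant at p < 0.001 (Field, 2009)
print(round(abs(z_skew), 4), round(abs(z_kurt), 2))
print(abs(z_skew) < THRESHOLD and abs(z_kurt) < THRESHOLD)
print(round(se_skewness(301), 3), round(se_kurtosis(301), 3))
```

Both signs are negative in the raw output; the text reports the absolute values, which is what the comparison with the threshold uses.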
Table : Tests of Normality for the “Fraction of visits referred from search engines” variable

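| Variable | Kolmogorov-Smirnov Statistic | Kolmogorov-Smirnov df | Kolmogorov-Smirnov Sig. | Shapiro-Wilk Statistic | Shapiro-Wilk df | Shapiro-Wilk Sig. |
|---|---|---|---|---|---|---|
| LG10 Global Rank | .066 | 301 | .003 | .992 | 301 | .106 |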

The next step of the data analysis process involves the assessment of the linear relationship between the dependent variable and the numeric independent variables by estimating Pearson’s R and Spearman’s rho correlation coefficients, as was done in the case of the “Global Rank” dependent variable (Table 24). Variables such as the “Global Rank”, the “Google Rank”, the “Lifetime” and the “Reputation” did not seem to present any significant linear relationship with the dependent variable, even after transformation, so we will avoid using them in the analysis, since including them would cost the model part of its explanatory strength.

Table : Pearson’s R estimation for the dependent-independent variables linear relationship


| | | Fraction of visits referred from search engines | Percentage of total Internet users visiting the medical website |
|---|---|---|---|
| Fraction of visits referred from search engines | Spearman’s rho Correlation Coefficient | 1.000 | -.089* |
| | Sig. (2-tailed) | . | .037 |
| | N | 258 | 257 |
| Percentage of total Internet users visiting the medical website | Spearman’s rho Correlation Coefficient | -.089 | 1.000 |
| | Sig. (2-tailed) | .037 | . |
| | N | 257 | 258 |




| | | Fraction of visits referred from search engines | Time spent on the website |
|---|---|---|---|
| Fraction of visits referred from search engines | Spearman’s rho Correlation Coefficient | 1.000 | -.226** |
| | Sig. (2-tailed) | . | .000 |
| | N | 258 | 253 |
| Time spent on the website | Spearman’s rho Correlation Coefficient | -.226** | 1.000 |
| | Sig. (2-tailed) | .000 | . |
| | N | 253 | 258 |




| | | Fraction of visits referred from search engines | Lifetime |
|---|---|---|---|
| Fraction of visits referred from search engines | Spearman’s rho Correlation Coefficient | 1.000 | -.128** |
| | Sig. (2-tailed) | . | .025 |
| | N | 307 | 307 |
| Lifetime | Spearman’s rho Correlation Coefficient | -.128* | 1.000 |
| | Sig. (2-tailed) | .025 | . |
| | N | 307 | 307 |




| | | Fraction of visits referred from search engines | Number of Unique Visits |
|---|---|---|---|
| Fraction of visits referred from search engines | Spearman’s rho Correlation Coefficient | 1.000 | -.178** |
| | Sig. (2-tailed) | . | .005 |
| | N | 258 | 242 |
| Number of Unique Visits | Spearman’s rho Correlation Coefficient | -.026 | 1.000 |
| | Sig. (2-tailed) | .679 | . |
| | N | 242 | 258 |




| | | LG10 Global Rank | Linked Websites |
|---|---|---|---|
| LG10 Global Rank | Spearman’s rho Correlation Coefficient | 1.000 | -.228* |
| | Sig. (2-tailed) | . | .032 |
| | N | 258 | 258 |
| Linked Websites | Spearman’s rho Correlation Coefficient | -.228* | 1.000 |
| | Sig. (2-tailed) | .032 | . |
| | N | 258 | 258 |
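For readers who want to see how a Spearman’s rho coefficient like those tabulated here is obtained, the following is a minimal sketch: the values are converted to ranks (ties receive the average rank) and Pearson’s correlation is computed on the ranks. The function names and the toy data are hypothetical; dropping pairs with a missing value is an assumption that would be consistent with the differing N values in the tables, not something the text confirms.

```python
import math
from typing import List, Optional, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x: Sequence[Optional[float]], y: Sequence[Optional[float]]) -> float:
    """Spearman's rho: Pearson's correlation of the ranks, after pairwise deletion."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical toy data: search share (percent) and time on site (minutes).
search_share = [41.0, 35.5, 20.0, None, 6.2, 3.1]
minutes_on_site = [1.5, 2.0, 2.5, 3.5, 3.0, 4.0]
print(spearman_rho(search_share, minutes_on_site))
```

Because the toy series is strictly decreasing in one variable as the other increases, the coefficient sits at the negative extreme; real data such as the study's produce much weaker values.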

What seems interesting is that the “Percentage of total Internet users visiting the medical website” has a significant negative linear correlation with the dependent variable, which means that as the fraction of visits referred from search engines increases, the percentage of Internet users visiting the website tends to decrease, and vice versa. This can be explained by the fact that well-known medical websites that attract a large number of Internet users as visitors/users/patients do not rely on their web presence in search engines; their quality is advertised in other ways, such as in forums and conversations among users or via other websites that link to them.



The next step of the analysis involved the implementation of the linear regression model. The R-squared value shows that the independent variables account for 28.6% of the variation in the dependent variable, so there may be other variables, not included in the model, that explain the rest of the variation. The ANOVA evaluates how well the model predicts the dependent variable. Since the value of the F-test is significant (p<0.01), the model provides a significantly better prediction than simply using the mean value of the dependent variable (Table 25). “Adjusted R-squared indicates the loss of predictive value for the model and it shows the variance accounted on the dependent variable if the model had been derived from the population the sample was taken” (Field, 2009). In addition, the value of the Durbin-Watson test for serial correlation (1.935) is substantially above 1 and close to 2, which indicates independence of errors, in line with the assumptions of linear regression.
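The Durbin-Watson statistic referred to above is computed from the ordered regression residuals: the sum of squared differences between successive residuals divided by the sum of squared residuals. A minimal sketch follows, assuming the residuals are available as a plain list; the function name and the residual values are hypothetical and are not the study's residuals.

```python
from typing import Sequence


def durbin_watson(residuals: Sequence[float]) -> float:
    """Durbin-Watson statistic: values near 2 suggest uncorrelated successive errors."""
    if len(residuals) < 2:
        raise ValueError("at least two residuals are required")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


# Hypothetical residuals, for illustration only.
example_residuals = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2]
print(round(durbin_watson(example_residuals), 3))
```

The statistic ranges from 0 (strong positive serial correlation) to 4 (strong negative serial correlation), which is why a value close to 2 is read as independence of errors.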
Table : Linear regression model summary for the “Percentage of visits referred from search engines” variable

Linear Regression Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | Durbin-Watson |
|---|---|---|---|---|---|
| 1 | .535a | .286 | .202 | 9.67895 | 1.935 |

ANOVA

| Model | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 8234.747 | 26 | 316.721 | 3.381 | .000 |
| Residual | 20516.389 | 219 | 93.682 | | |
| Total | 28751.135 | 245 | | | |
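The figures in the model summary follow arithmetically from the ANOVA sums of squares and degrees of freedom. The short check below recomputes them; the variable names are illustrative, and the adjusted R-squared uses the usual degrees-of-freedom correction, which is an assumption about how the reported value was obtained (it does reproduce .202).

```python
import math

# Sums of squares and degrees of freedom as reported in the ANOVA table.
ss_regression, df_regression = 8234.747, 26
ss_residual, df_residual = 20516.389, 219
ss_total, df_total = 28751.135, 245

ms_regression = ss_regression / df_regression  # mean square of the model
ms_residual = ss_residual / df_residual        # mean square of the errors
f_statistic = ms_regression / ms_residual

r_squared = ss_regression / ss_total
# Usual correction: 1 - (1 - R^2) * (n - 1) / (n - k - 1), with n - 1 = df_total
# and n - k - 1 = df_residual.
adjusted_r_squared = 1.0 - (1.0 - r_squared) * df_total / df_residual
std_error_estimate = math.sqrt(ms_residual)

print(round(r_squared, 3), round(adjusted_r_squared, 3))
print(round(f_statistic, 3), round(std_error_estimate, 3))
```

The gap between R-squared and adjusted R-squared reflects the 26 predictors being estimated from 246 cases.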









Examining the output of the linear regression model, it is observed that the current status of the medical website (online/offline), governmental funding as a source of income, the presence of drug/symptom checkers as interactive medical applications, the percentage of total Internet users visiting the website and the time that they tend to spend on it are significant predictors of the dependent variable. The sign of the beta coefficient is negative for the time on website and for the percentage of total Internet users, implying that as the fraction of visits referred from search engines increases, the time that users spend on the website and the share of Internet users visiting it tend to decrease; for the rest of the significant predictors the positive sign implies a positive relationship with the dependent variable.



Table : Linear regression outputs for the “Percentage of visits referred from search engines” variable

| Model | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 26.286 | 6.870 | | 3.826 | .000 |
| Service | 2.803 | 1.608 | .125 | 1.743 | .083 |
| Website status | 6.341 | 2.854 | .131 | 2.222 | .027 |
| Ask a Doctor | -2.525 | 2.334 | -.072 | -1.081 | .281 |
| Find a doctor or local care | -1.078 | 1.856 | -.042 | -.581 | .562 |
| Drug or symptom checker | 5.659 | 2.510 | .164 | 2.254 | .025 |
| Social Networks | .415 | 1.397 | .019 | .297 | .767 |
| Certification | -1.126 | 1.586 | -.045 | -.710 | .478 |
| Percentage of total Internet users | -.963 | .484 | -.132 | -1.990 | .048 |
| LOG10 unique visits | 1.503 | .859 | .149 | 1.749 | .082 |
| LOG10 Linked websites | -.846 | 1.202 | -.060 | -.704 | .482 |
| Specific health issues | -1.187 | 3.117 | -.048 | -.381 | .704 |
| General Health | -.763 | 3.147 | -.031 | -.242 | .809 |
| Medical resources | -1.966 | 3.257 | -.077 | -.604 | .547 |
| Profit | -2.791 | 1.930 | -.123 | -1.446 | .150 |
| Non-profit Governmental | -6.064 | 3.487 | -.172 | -1.739 | .083 |
| Governmental Funding | 10.323 | 3.450 | .317 | 2.992 | .003 |
| Products/services/shop | -1.872 | 5.237 | -.079 | -.358 | .721 |
| Donations/sponsors/grants | -2.804 | 5.234 | -.126 | -.536 | .593 |
| Membership | 2.669 | 5.570 | .070 | .479 | .632 |
| University and research Funding | -1.945 | 5.512 | -.057 | -.353 | .725 |
| Partners/ mother profit org. | 5.943 | 6.287 | .098 | .945 | .346 |
| Advertisements | -4.368 | 5.560 | -.105 | -.786 | .433 |
| Drug info | -7.154 | 4.265 | -.153 | -1.678 | .095 |
| Medical products | -4.999 | 4.748 | -.082 | -1.053 | .294 |
| Medical education | -3.258 | 3.800 | -.088 | -.857 | .392 |
| Time on website | -2.460 | .453 | -.357 | -5.436 | .000 |
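The reading of the coefficient table can be made explicit with a short filter: keep the predictors whose significance is below 5%, recompute t as B divided by its standard error, and report the sign of B. The tuples below copy B, Std. Error and Sig. from the table; the variable names and the 0.05 cut-off constant are illustrative choices, and a reported Sig. of .000 is stored as 0.0 although it only means p < .0005.

```python
ALPHA = 0.05

# (predictor, B, Std. Error, Sig.) as reported in the coefficient table.
coefficients = [
    ("Service", 2.803, 1.608, 0.083),
    ("Website status", 6.341, 2.854, 0.027),
    ("Ask a Doctor", -2.525, 2.334, 0.281),
    ("Find a doctor or local care", -1.078, 1.856, 0.562),
    ("Drug or symptom checker", 5.659, 2.510, 0.025),
    ("Social Networks", 0.415, 1.397, 0.767),
    ("Certification", -1.126, 1.586, 0.478),
    ("Percentage of total Internet users", -0.963, 0.484, 0.048),
    ("LOG10 unique visits", 1.503, 0.859, 0.082),
    ("LOG10 Linked websites", -0.846, 1.202, 0.482),
    ("Specific health issues", -1.187, 3.117, 0.704),
    ("General Health", -0.763, 3.147, 0.809),
    ("Medical resources", -1.966, 3.257, 0.547),
    ("Profit", -2.791, 1.930, 0.150),
    ("Non-profit Governmental", -6.064, 3.487, 0.083),
    ("Governmental Funding", 10.323, 3.450, 0.003),
    ("Products/services/shop", -1.872, 5.237, 0.721),
    ("Donations/sponsors/grants", -2.804, 5.234, 0.593),
    ("Membership", 2.669, 5.570, 0.632),
    ("University and research Funding", -1.945, 5.512, 0.725),
    ("Partners/ mother profit org.", 5.943, 6.287, 0.346),
    ("Advertisements", -4.368, 5.560, 0.433),
    ("Drug info", -7.154, 4.265, 0.095),
    ("Medical products", -4.999, 4.748, 0.294),
    ("Medical education", -3.258, 3.800, 0.392),
    ("Time on website", -2.460, 0.453, 0.000),
]

# Predictors significant at the 5% level, with their recomputed t statistic.
significant = [(name, b, b / se) for name, b, se, p in coefficients if p < ALPHA]
negative = [name for name, b, _ in significant if b < 0]

for name, b, t in significant:
    direction = "negative" if b < 0 else "positive"
    print(f"{name}: B = {b:+.3f}, t = {t:.3f} ({direction})")
```

The recomputed t values can differ from the tabulated ones in the last decimals because B and Std. Error are themselves rounded in the table.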

To check the regression residuals for traces of heteroscedasticity and to verify that the residual errors are normally distributed, the distribution of the regression residuals was examined and a scatterplot of the regression standardised residuals against the regression standardised predicted values was formed. As can be seen in Figure 26, and more specifically in the histogram and the P-P plot, the residuals deviate only very slightly from the normal distribution. Moreover, the scatterplot of the standardised residuals against the standardised predicted values does not show any specific pattern, which further supports the assumption that there is no heteroscedasticity.


Figure : Examining heteroscedasticity and normal distribution of errors in linear regression model



In order to validate the linear regression model and test its generalizability with a cross-validation analysis, we split the sample into two subsamples in a 50%-50% ratio, as proposed by Bagley et al. (2001), Godfrey (1985), Marill (2004), Schneider et al. (2010) and Field (2009). The validation results have shown that the model can predict the relationship between the dependent and the independent variables, since both subsamples had the same significance and the same significant independent variables. It can also predict the strength of this relationship, since the R-squared of the one subsample differed by no more than 5% from the R-squared of the other (Table 27).
Table : Cross validation of the linear regression model for the “Percentage of visits referred from search engines” variable

Model Summary

| Model | R, split = .00 (Selected) | R, split ~= .00 (Unselected) | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|---|
| 1 | .613 | .345 | .376 | .198 | 10.06533 |

| Model | R, split = .00 (Selected) | R, split ~= .00 (Unselected) | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|---|
| 1 | .612 | .250 | .375 | .214 | 9.29657 |

ANOVA

| Model | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 5554.046 | 26 | 213.617 | 2.109 | .005 |
| Residual | 9219.282 | 91 | 101.311 | | |
| Total | 14773.328 | 117 | | | |

| Model | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 5230.055 | 26 | 201.156 | 2.327 | .001 |
| Residual | 8729.055 | 101 | 86.426 | | |
| Total | 13959.110 | 127 | | | |
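The split-sample validation can be sketched in two parts: a random division of the cases into two halves, and a comparison of the R-squared values of the two fitted models. The `split_half` helper, its seed and the dictionary layout are hypothetical; the sums of squares are the ones reported in the two ANOVA tables of the cross-validation, and the 5% criterion is checked as an absolute difference, which is one possible reading of the text.

```python
import random
from typing import List, Sequence, Tuple


def split_half(rows: Sequence, seed: int = 0) -> Tuple[List, List]:
    """Randomly divide the cases into two subsamples of (nearly) equal size."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]


# Reported ANOVA figures of the two subsample models. The total degrees of
# freedom (117 and 127) imply subsamples of 118 and 128 cases.
subsamples = {
    "first": {"ss_regression": 5554.046, "ss_total": 14773.328},
    "second": {"ss_regression": 5230.055, "ss_total": 13959.110},
}

r_squared = {
    name: anova["ss_regression"] / anova["ss_total"]
    for name, anova in subsamples.items()
}
difference = abs(r_squared["first"] - r_squared["second"])

print({name: round(value, 3) for name, value in r_squared.items()})
print(difference <= 0.05)

# Illustration of the split itself on 246 hypothetical case identifiers.
half_a, half_b = split_half(range(246), seed=1)
print(len(half_a), len(half_b))
```

An exact half-and-half split is shown for simplicity; the reported subsample sizes suggest that cases were assigned at random with roughly equal probability rather than cut at the midpoint.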












