CA3LE, posted August 17, 2017

Some pages are there but many are not and, most importantly, the index isn't showing up.
mudmanc4, posted August 17, 2017

I would default to looking for an IPB metadata issue.
CA3LE (Author), posted August 17, 2017

Why do you say that?
mudmanc4, posted August 17, 2017

If I recall, there have been issues along these lines in the past. More than once.
nanobot, posted August 17, 2017

3 hours ago, CA3LE said:
> Some pages are there but many are not and most importantly the index isn't showing up.

Have you looked at the Google Search Console? https://www.google.com/webmasters/tools/

It usually has good information as to why pages aren't indexed, or are prioritized lower than others.

Thanks, EBrown
Sean, posted August 17, 2017

One possibility is that Google sees too many versions of the index (e.g. uk.testmy.net, dallas.testmy.net, etc.) and only lists a few variants. I remember this being a pain back in the Joomla 1.x days, when it would show the same page under various URLs and Google usually ended up indexing obscure URL variations of some pages.

Both Bing and DuckDuckGo have the main testmy.net homepage URL indexed, so it doesn't seem to be something preventing crawlers from indexing it.

One thing I suggest is adding a canonical link tag to the head of the home page to specify "https://testmy.net" as the preferred URL, as explained here.
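To make the canonical suggestion concrete, here is a minimal sketch of how one might check which canonical URL a page declares, using only Python's standard library. The sample markup, the `CanonicalFinder` class, and the `find_canonical` helper are assumptions for illustration; the thread does not show the site's actual markup.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical homepage head, as it might look once the tag is added.
SAMPLE_PAGE = """
<html><head>
  <title>Example homepage</title>
  <link rel="canonical" href="https://testmy.net/">
</head><body></body></html>
"""

if __name__ == "__main__":
    print(find_canonical(SAMPLE_PAGE))
```

Running the same check against each mirror's homepage would show whether they all point at the one preferred URL.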
CA3LE (Author), posted August 17, 2017

2 hours ago, nanobot said:
> Have you looked at the Google Search Console? https://www.google.com/webmasters/tools/ It usually has good information as to why pages aren't indexed, or are prioritized lower than others. Thanks, EBrown

That was the first place I looked. No messages. No issues coming up.

6 minutes ago, Sean said:
> One possibility is that Google sees too many versions of the index, e.g. uk.testmy.net, dallas.testmy.net, etc. and only lists a few variants. I remember this being a pain in the past in the Joomla 1.x days where it would show the same page under various URLs and Google usually ended indexing obscure URL variations of some pages. Both Bing and DuckDuckGo have the main test.my homepage URL indexed, so it doesn't seem to be something preventing crawlers from indexing it. One thing I suggest is adding a Canonical meta header tag to the home page to specify "https://testmy.net" as the preferred URL, as explained here.

That's definitely a strong possibility. I edited the robots.txt on those servers to disallow crawling, and I just added the canonical header. Great suggestion.

The Expires header is set to nocache, and even though I changed session.cache_limiter to public in php.ini, it's still showing Thu, 19 Nov 1981 08:52:00 GMT. Maybe there's a setting in nginx I'm missing.
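As a rough illustration of the robots.txt change mentioned above, the sketch below uses Python's standard `urllib.robotparser` to show how a blanket Disallow rule reads to a crawler. The rule text, the `is_crawlable` helper, and the mirror URL path are assumptions for illustration; the actual robots.txt on those servers isn't shown in the thread.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt for a mirror that should stay out of the index.
MIRROR_ROBOTS = """\
User-agent: *
Disallow: /
"""

# Assumed robots.txt for the main site; an empty Disallow allows everything.
MAIN_ROBOTS = """\
User-agent: *
Disallow:
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_crawlable(MIRROR_ROBOTS, "Googlebot", "https://uk.testmy.net/"))
    print(is_crawlable(MAIN_ROBOTS, "Googlebot", "https://testmy.net/"))
```

One caveat worth keeping in mind: a crawler that is disallowed from a mirror cannot fetch its pages, so it also never sees any canonical tag or redirect served there.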
nanobot, posted August 17, 2017

I wonder if it could potentially be a performance issue? A basic network test in Chrome shows a lot of traffic loading past the 0.5s - 1.0s mark, and Google puts a weight on performance as well. I have noticed over the past few days that it has felt somewhat laggy here - on my phone it takes 10-15s to do a full refresh over LTE or WiFi.

I also know the canonical tag issue mentioned by @Sean can have a negative effect if it's not properly set - Google may be indexing all the different variants, but the main page is not shown because it sees it as a "duplicate" of the uk.testmy.net page.

Also, the Unix epoch time for that Expires value is `375007920`.

I also wonder if there aren't bigger issues - my search results (attached) for even `testmy.net` only have TMN as the top 6 results, which makes me think Google doesn't have a bigger problem with something.

Thanks, EBrown
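The epoch figure can be reproduced with a short sketch using Python's standard `email.utils` module, which parses HTTP-style dates. The `http_date_to_epoch` helper name is an assumption for illustration; the date string is the Expires value quoted earlier in the thread.

```python
from email.utils import parsedate_to_datetime

# The Expires value reported earlier in the thread.
EXPIRES_VALUE = "Thu, 19 Nov 1981 08:52:00 GMT"


def http_date_to_epoch(value: str) -> int:
    """Convert an HTTP-style date string to whole seconds since the Unix epoch."""
    # A "GMT" zone yields a timezone-aware datetime, so timestamp() is unambiguous.
    return int(parsedate_to_datetime(value).timestamp())


if __name__ == "__main__":
    print(http_date_to_epoch(EXPIRES_VALUE))  # 375007920
```

A date this far in the past tells caches the response is already stale; it is the fixed value PHP sends when `session.cache_limiter` is `nocache`, which suggests the php.ini change had not taken effect for that response.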
CA3LE (Author), posted August 17, 2017

Yeah, I don't know what's going on. I'd expect a message in the webmaster console if there was something wrong.

Performance could have been Cloudflare. I disabled them last night; I had only been running them since that recent net neutrality deadline, in order to run that plugin. I'm wondering if enabling that had anything to do with it. Maybe Cloudflare passed the Expires header along differently and it made Google think I didn't want the pages cached.

I submitted for reindexing. Hopefully it's fixed soon. -- Come on Google!

I hope it's just an issue with those other servers redirecting to testmy.net. The redirects should have included a 301 header and also the canonical meta.

If you can find anything I'm doing wrong from Google's perspective, please let me know.

I'm just trying to help people over here by providing a free and hopefully useful service. Between stuff like this, ad spammers and people trying to hack me constantly, it makes it really difficult to do for the Internet what I really want.
Sean, posted August 17, 2017

I just noticed on the https version of the site that the homepage robots meta tag has a 'noindex':

I'm not entirely sure how Google indexes a duplicate URL path if the https version has 'noindex' in the page's robots meta tag but the other does not. What I suspect is that when Googlebot sees the 'noindex' in the robots meta tag of the https page, it drops the corresponding http URL page. The canonical meta tag to the http version you added should be sufficient to index the http version in preference over the https version.
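A stray 'noindex' like this is easy to miss by eye. The sketch below, using only Python's standard library, shows one way to detect it in a page's markup. The sample pages, the `RobotsMetaFinder` class, and the `has_noindex` helper are assumptions for illustration, not the site's real markup.

```python
from html.parser import HTMLParser
from typing import Set


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # Directives are comma-separated and case-insensitive.
        for item in (attr.get("content") or "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


def has_noindex(html: str) -> bool:
    """Return True if the page's robots meta tag asks not to be indexed."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Hypothetical https and http variants of the same homepage.
HTTPS_PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
HTTP_PAGE = '<html><head><meta name="robots" content="index, follow"></head></html>'

if __name__ == "__main__":
    print(has_noindex(HTTPS_PAGE), has_noindex(HTTP_PAGE))
```

Comparing the result for the http and https variants of the same path makes a mismatch like the one described here stand out immediately.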
CA3LE (Author), posted August 17, 2017

5 hours ago, Sean said:
> I just noticed on the https version of the site that the homepage robots meta tag has a 'noindex':

Originally when I turned https on it was because you wanted to test on SSL and port 8080... I didn't necessarily want the search engines spidering what we hadn't even tested.

I think Google is going to weight https sites more heavily in the future, so I'm moving the site over to https. They already do, but I expect it's going to be even more so.

I didn't realize the noindex was still in there... it's now controlled correctly with a separate robots.txt, using mod_rewrite to switch the file:

    RewriteCond %{SERVER_PORT} ^443$
    RewriteRule ^robots\.txt$ robots_ssl.txt [L]

...but I've already disabled that in preparation.

I'll make sure to include a toggle in the new settings with the option to switch between http and https. But the site itself will run completely on SSL, regardless of the option selected. And visiting https://testmy.net will no longer trigger that option. It's really already done; I just want to make sure before I flip it over that I'm not missing anything that will affect PageRank negatively.

I don't think the noindex on the https version was causing the issue though. TestMy.net was showing up on Google again before I removed that line. It could have been one or a combination of the issues we previously talked about.

I'll keep working to find my mistakes. :-/ -- one constant is human error.