Coolthing Of Theday


Monday, 18 March 2013

Stupid Feed Tricks to amaze your producers, consumers, readers and parsers [not!]

Posted on 17:08 by Unknown

inessential - Brian’s Stupid Feed Tricks

At NewsGator and Sepia Labs I worked with Brian Reischl, one of the server-side guys. Among other things, he worked on NewsGator’s RSS content service, which reads n million feeds once an hour.

(I don’t know if I can say what n is. It surprised me when I heard it. The system is still running, by the way.)

Brian is intimately acquainted with the different ways feeds can be screwed up. So he posted Stupid Feed Tricks on Google Docs...

Stupid Feed Tricks

Stupid HTTP Tricks

  1. When the feed is gone/errored, publisher may still return a 200 OK but send an HTML page instead.

  2. Using permanent redirects for temporary errors. In one instance, all the Microsoft blogs had a temporary system error. All the feeds did a permanent redirect to the same system error page, and we updated all 40,000 feeds to point to that one URL. Whoops.

...
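
A consumer can defend against both of the tricks above with a couple of small checks. What follows is only a minimal sketch, not anything from Brian's document: `looks_like_feed`, `RedirectTracker` and the three-sightings threshold are all hypothetical names and values chosen for illustration.

```python
class RedirectTracker:
    """Only trust a permanent redirect after seeing the same target repeatedly.

    The threshold is an assumption; pick whatever suits your polling interval.
    """

    def __init__(self, required_sightings=3):
        self.required = required_sightings
        self.seen = {}  # feed URL -> (last redirect target, consecutive count)

    def observe(self, feed_url, target):
        """Record one permanent redirect; return True once it looks stable."""
        prev_target, count = self.seen.get(feed_url, (None, 0))
        count = count + 1 if target == prev_target else 1
        self.seen[feed_url] = (target, count)
        return count >= self.required


def looks_like_feed(body):
    """Heuristic: does a 200 OK body start like a feed rather than an HTML page?"""
    # Drop a UTF-8 byte order mark and leading whitespace, then sniff the start.
    head = body[:2048].lstrip(b"\xef\xbb\xbf \t\r\n").lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return False
    # Look for an RSS, Atom or RDF root element near the top of the document.
    return any(tag in head for tag in (b"<rss", b"<feed", b"<rdf:rdf"))
```

With something like this in place, the stored feed URL would only be rewritten once `observe` returns True, and a 200 response that fails `looks_like_feed` would be treated as a fetch error instead of as an empty feed.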

Stupid XML Tricks

  1. Any sort of XML well-formedness error you can think of. Missing closing tags, mismatched tags, bad escaping, not quoting attributes, missing root elements.

  2. Including unescaped HTML content inside a tag - which sort of works, except that most HTML isn’t XML-compliant.

  3. ...
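
To see how a strict XML parser reacts to these, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The wrapper name `parse_feed_xml` is hypothetical, and returning `None` on failure is just one possible policy; a real reader might fall back to a more forgiving parser at that point.

```python
import xml.etree.ElementTree as ET


def parse_feed_xml(body):
    """Return the root element, or None if the body is not well-formed XML."""
    try:
        return ET.fromstring(body)
    except ET.ParseError:
        # Covers mismatched tags, unquoted attributes, bad escaping,
        # a missing root element, and so on.
        return None


good = b"<rss><channel><title>ok</title></channel></rss>"
# Unescaped HTML inside a tag: <br> is never closed, so this is not XML.
bad_html = b"<rss><channel><description><p>one<br>two</p></description></channel></rss>"
bad_escape = b"<rss><channel><title>AT&T</title></channel></rss>"

print(parse_feed_xml(good) is not None)  # True
print(parse_feed_xml(bad_html) is None)  # True
print(parse_feed_xml(bad_escape) is None)  # True
```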

Stupid RSS/Atom Tricks

  1. Missing any element you can think of.

  2. Adding custom elements without namespaces.

  3. ...
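
If any element can be missing, the reader has to decide on fallbacks up front. A minimal sketch, assuming RSS 2.0 style item children (`title`, `link`, `guid`); the function name `item_summary`, the "(untitled)" placeholder and the guid-falls-back-to-link rule are assumptions for illustration, not a recommendation from the original document.

```python
import xml.etree.ElementTree as ET


def item_summary(item):
    """Pull out the fields we need from an <item>, tolerating missing ones."""

    def text(tag):
        value = item.findtext(tag)
        # Treat an absent element and an empty element the same way.
        return value.strip() if value and value.strip() else None

    link = text("link")
    return {
        "id": text("guid") or link,  # fall back to the link as identity
        "title": text("title") or "(untitled)",
        "link": link,
    }


item = ET.fromstring("<item><link>http://example.com/post/1</link><mood>happy</mood></item>")
print(item_summary(item))
```

Because the lookup asks for known element names only, a custom element without a namespace (the made-up `<mood>` above) is simply ignored.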

Other Stupid Tricks

  1. Updating posts very frequently. Newspapers are very fond of this. In 4 hours they might change a post 12 times; by the end it might have nothing in common with the original article (completely different title, completely different body). Sometimes combined with not using lastUpdated, or just not changing lastUpdated.
  2. Publishing updated posts as new posts, so you have 12 versions of the same post in the feed.
  3. ....
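
When the update timestamp can't be trusted, one option is to fingerprint the content itself. This is a minimal sketch under stated assumptions: `content_fingerprint` and `classify` are hypothetical names, the store is a plain dict, and it only helps when the post keeps a stable identifier. An update republished under a brand new identifier would still look like a new post here.

```python
import hashlib


def content_fingerprint(title, body):
    """Hash of title plus body with whitespace runs collapsed."""
    normalized = " ".join((title + "\n" + body).split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def classify(stored, post_id, title, body):
    """Return 'new', 'updated' or 'unchanged', and remember the fingerprint."""
    fingerprint = content_fingerprint(title, body)
    previous = stored.get(post_id)
    stored[post_id] = fingerprint
    if previous is None:
        return "new"
    return "unchanged" if previous == fingerprint else "updated"


seen = {}
print(classify(seen, "post-1", "Storm nears coast", "First draft."))  # new
print(classify(seen, "post-1", "Storm nears coast", "First  draft."))  # unchanged
print(classify(seen, "post-1", "Storm makes landfall", "Rewritten."))  # updated
```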

Random Notes

  1. You should think hard about canonicalization of URLs. Some parts of the URL can be case-sensitive (path and query); other parts are not (protocol and host). Users (and webmasters) will absolutely use different upper/lower casing in different places.
  2. If you build a database index on FeedUrl, consider that 99% of them start with “http://”, which makes for a shitty index. Consider separating the protocol into its own column, and then indexing on the remainder of the URL. Alternatively, you could index on a hashed value of the URL. Theoretically you could have collisions, but in practice there are not that many feeds.
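
Both notes can be sketched with Python's standard `urllib.parse` and `hashlib`. The helper names `canonicalize`, `split_for_index` and `url_key` are hypothetical, and the rules are assumptions that go slightly beyond the notes above: default ports and fragments are dropped, an empty path becomes "/", and userinfo and IPv6 hosts are not handled.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalize(url):
    """Lowercase the scheme and host; leave path and query casing alone."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # urlsplit already lowercases hostname
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    return urlunsplit((scheme, netloc, parts.path or "/", parts.query, ""))


def split_for_index(url):
    """Return (scheme, remainder) so the remainder can be indexed on its own."""
    canonical = canonicalize(url)
    scheme, _, remainder = canonical.partition("://")
    return scheme, remainder


def url_key(url):
    """Hashed value of the canonical URL, usable as an index key."""
    return hashlib.sha256(canonicalize(url).encode("utf-8")).hexdigest()


print(canonicalize("HTTP://Example.COM:80/Feed.XML?Tag=A"))
# http://example.com/Feed.XML?Tag=A
print(split_for_index("https://example.com"))
# ('https', 'example.com/')
```

Two spellings of the same feed URL that differ only in scheme or host casing then produce the same `url_key`, while a difference in path casing still yields a different key.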


Since we've all been about RSS this past week or so, and since many might again play in the RSS space, I thought this document was great, coming from someone who's really been there, done that...

Posted in WebFeed | No comments


