[Fiware-lab-federation-nodes] [CESNET #134122] Re: experiences with HA

Sean Murphy murp at zhaw.ch
Thu Nov 19 09:58:09 CET 2015


Hi Fanis, all,

> Spain deployed HA, had issues, and reverted to a single controller in Juno.
> In Kilo they deployed HA with DVR, but they had issues and
> reverted to legacy routers (which of course cancels "pure" HA).
>

So, to be clearer on this: we think Spain has done a Kilo/HA deployment
without DVR.

Having 'legacy' routers within some kind of failover mechanism still looks
better than having only one router. I know you had problems with this in
the past: do you know if these problems have been solved?

> Giuseppe has deployed Kilo with HA (& DVR???) in a lab-only environment and
> it seems stable. Is the lab environment on real hardware or virtual?
>

IIUC, Giuseppe indicated that using DVR was risky and basically advised
against it for production.


> I also think there is a confusion between DVR and the L3 agent. In my
> opinion an L3 agent can be in HA without the routers being run as DVR. The
> risk with this setup is that something like what happened to me (the L3
> agent failed over but did not "carry" the L3 router/namespace information
> with it) can easily happen again.


My understanding was that this is exactly the VRRP case that was (more or
less) suggested in the confcall.
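
For reference, here is a minimal sketch of how VRRP-style router HA is
usually switched on in neutron.conf, with DVR left off. The option names are
the Juno/Kilo-era Neutron ones; the values are purely illustrative and are
not taken from any node discussed in this thread, so please check them
against the networking guide for your release.

```ini
# neutron.conf on the controller(s): illustrative values only,
# not the configuration of any FIWARE Lab node
[DEFAULT]
# Create new routers as HA (VRRP/keepalived) routers by default
l3_ha = True
# Upper and lower bounds on how many L3 agents host each HA router
max_l3_agents_per_router = 3
min_l3_agents_per_router = 2
# Keep new routers centralized (non-DVR) by default
router_distributed = False
```

With settings along these lines, each router should get one active instance
and at least one standby instance on another L3 agent, so a failed agent's
routers can be taken over without moving namespaces by hand.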


> DVR creates an active/standby scenario where if a node fails a router that
> resides on another node will just revert to Active state and keep on
> routing the traffic.
>



> I found loads of insightful and valuable information in this blog:
> http://assafmuller.com/. I hope we can continue this discussion, since I
> think it is for the good of the project and it will eventually lead to
> better/more stable implementations.
>

I strongly agree with this: sharing information on these points is very
important.

BR,
Seán.



> Best regards,
> Fanis
>
>
>
>  From:   José Ignacio Carretero <
> joseignacio.carreteroguarde at telefonica.com>
>  To:   Giuseppe Cossu <giuseppe.cossu at create-net.org>
>  Cc:   "fiware-lab-federation-nodes at lists.fiware.org" <
> fiware-lab-federation-nodes at lists.fiware.org>, Cristian Cristelotti <
> cristian.cristelotti.coll at trentinonetwork.it>
>  Sent:   18/11/2015 12:00 PM
>  Subject:   Re: [Fiware-lab-federation-nodes] [CESNET #134122] Re:
> experiences with HA
>
>
>  The problem with legacy routers is HA.
>
>  Regards,
>  José Ignacio.
>
>
> On 18/11/15 at 10:58, Giuseppe Cossu wrote:
>
> Jose',
> indeed the official OpenStack documentation reports that "the Kilo release
> increases stability and reliability of DVR considerably over the Juno
> release".
>
>
> Anyway as you reported if the legacy routers are stable, I don't see any
> problems using them.
>
>
> Thanks for your feedback.
>
>
> Regards,
> Giuseppe
>
>
> On Wed, Nov 18, 2015 at 10:03 AM, José Ignacio Carretero <
> joseignacio.carreteroguarde at telefonica.com>  wrote:
>
> Hi,
>
>  That was what we thought: DVR seemed to be a good solution for HA, and
> so we configured the Spanish node this way. The fact is that it didn't work
> and we had many problems with DVR. I really don't think this technology is
> mature yet.
>
>  The Spain2 node is configured to use DVR routers; however, we're actually
> using legacy routers only, because distributed routers were unstable.
>
>  Regards,
>  José Ignacio.
>
>
> On 17/11/15 at 14:25, Giuseppe Cossu wrote:
>
>
>
> Hi all,
> I want to share with you this link that lists the deployment scenarios of
> Neutron: http://docs.openstack.org/networking-guide/deploy.html
> As I said, the main problems using HA in OpenStack were related to Neutron,
> because the L3 agent was configured in active/passive and it was
> not actually ready to run in real HA. For that reason the OpenStack
> community has developed DVR (introduced in Juno) that - on paper -
> solves many issues related to Neutron. For sure it overcomes many Neutron
> architecture limitations (performance, scalability, bottleneck of the
> networking node).
>
>
>
> I can confirm from my direct experience that Juno with the legacy L3 agent
> is quite stable in a production environment.
> Regarding Kilo I would suggest using DVR - but - as Fanis stated, there
> could be some unexpected issues... so it is up to the IOwner to select the
> wise thing to do.
>
>
> NOTE: using Fuel 7.0 you don't have the possibility to choose between
> with-HA/without-HA. It deploys an HA environment, so using Fuel you have to
> manage the Corosync/Pacemaker cluster. That means that Neutron is also
> installed in HA.
> Fuel 7.0 has an additional option regarding the Neutron installation: you
> can choose whether or not to use DVR (if you do not select DVR, the legacy
> L3 agent is used).
>
>
> Regarding the OpenStack architecture and procedures using HA, Mirantis
> offers very useful documentation:
> https://docs.mirantis.com/openstack/fuel/fuel-7.0/#guides . In
> particular regarding HA:
> https://docs.mirantis.com/openstack/fuel/fuel-7.0/operations.html and
> https://docs.mirantis.com/openstack/fuel/fuel-7.0/reference-architecture.html#multi-node-with-ha-deployment
>
>
> Regards,
> Giuseppe
>
>
>
>
> On Tue, Nov 17, 2015 at 1:17 PM, Sean Murphy  <murp at zhaw.ch> wrote:
>
> Hi again all,
>
>
> To follow up on this after the discussion on the confcall this morning
> (which I found very useful; it might be good to have more discussion of
> these important issues on the calls from time to time):
>
>
> The status of the Spanish node was not clear to me: I did not concretely
> understand what Fernando said regarding HA. From previous communication,
> I understand that they chose not to use HA in Juno; in the minutes of
> today's meeting, I see
>
>
> "Migrated to Kilo, pending swift migration (waiting help from IBM)"
>
>
>
> @Fernando - can you tell us if you went with HA in Kilo?
>
>
> BR,
> Seán.
>
>
>
>
>
>
> On Mon, Nov 16, 2015 at 9:27 AM, Murphy Seán (murp) <murp at zhaw.ch> wrote:
>
>
> Hi Fede, all,
>
> Juno HA is quite stable in our experience. The problems are always related
> to Neutron when you restart a
>
>
> Good to hear.
>
>
>
> node. So rule number one: if you need to restart, use Corosync to call
> out your node. This will do a graceful re-balancing among L3 agents. In
> case of sudden "death" of the node, the problem is not so much in that, but
> in when you re-attach the node. Also in this case, correct management of
> Corosync is the trick.
>
>
> Thanks for the pointers - I may ask for more info on the confcall as I
> don't fully get the point here. Also, it would be good to know if this
> also applies to Kilo.
>
>
>
> In case you have not noticed, following the new DoW in FI-CORE and the
> Open Call, requirements on SLA and availability are quite strict, so if
> your node dies because the only controller you have is unrecoverable, and
> because of that you breach the required availability threshold, this may
> have financial implications for FI-CORE nodes.
>
>
> Thanks for pointing that out. I guess everyone has a strong interest in
> having the systems as reliable as possible - unreliable systems give lots
> of headaches. I guess what I was interested in knowing is whether HA is
> likely to make the system more reliable or less reliable: the experience
> in XiFi was that it seemed to make things less reliable.
>
>
> BR,
> Seán.
>
> Br,
>
>
> Federico
>
>
> --
>  Future Internet is closer than you think!
>  http://www.fiware.org
>
>  Official Mirantis partner for OpenStack Training
>  https://www.create-net.org/community/openstack-training
>
>  --
>  Dr. Federico M. Facca
>
>  CREATE-NET
>  Via alla Cascata 56/D
>  38123 Povo Trento (Italy)
>
>  P  +39 0461 312471
>  M  +39 334 6049758
>  E  federico.facca at create-net.org
>  T @chicco785
>  W  www.create-net.org
>
>
>
>
>
> On Fri, Nov 13, 2015 at 11:54 AM, Theofanis Katsiaounis <
> th_katsiaounis at neuropublic.gr> wrote:
>
> Hi all,
>  Indeed Kilo could solve the network issues since networking is HA
>  capable too.
>  Containers/Swift can be a problem especially since you have to leave
>  space to create the storage rings etc.
>
>  Regards,
>  Fanis
>
>  On 13/11/2015 12:50 PM, Cristian Cristelotti wrote:
>
>
> > Hi Sean,
>  >
>  > Our experience with Grizzly (HA) was very bad. IceHouse (HA) was better
>  > but not stable. Now we are on Juno on a single node and we haven't faced
>  > any problems.
>  > We are working on the migration to Kilo (HA + Murano + Ceilometer).
>  >
>  > Kilo seems to have solved the problems mentioned by Fanis.
>  > If you do not deploy the node with HA, you will not have container
>  > functionality, or rather you will have to install Swift manually after the
>  > Fuel deployment.
>  >
>  >
>  >
>  > Regards
>  >
>  > Cristian
>  >
>  > ----- Original message -----
>  > From: "Sean Murphy" <murp at zhaw.ch>
>  > To: "Theofanis Katsiaounis" <th_katsiaounis at neuropublic.gr>
>  > Cc: fiware-lab-federation-nodes at lists.fiware.org
>  > Sent: Friday, 13 November 2015 11:40:13
>  > Subject: Re: [Fiware-lab-federation-nodes] [CESNET #134122] Re:
>  > experiences with HA
>  >
>  >
>  >
>  > Hi all,
>  >
>  >
>
>
> > So the feedback so far is the following:
>  > - Riwal says that running Juno/HA is not so problematic, but has not
> had a specific failure
>  > situation where HA could really be tested
>  > - Fernando notes that Juno/HA exhibited stability problems for larger
> numbers of users and
>  > decided against it
>  > - Fanis notes that Icehouse/HA was quite problematic in multiple
> respects
>  >
>  >
>  > From our pov, this is not painting a very positive picture regarding HA
>  > and, despite our inclination to experiment with newer technologies, we
>  > would probably opt not to use HA.
>  >
>  >
>  > Does anyone in the project have Kilo/HA experience?
>  >
>  >
>  > BR,
>  > Seán.
>  >
>  >
>  >
>  >
>  >
>  >
>  > On Fri, Nov 13, 2015 at 10:38 AM, Theofanis Katsiaounis <
> th_katsiaounis at neuropublic.gr > wrote:
>  >
>  >
>  >
>  > Hi all,
>  > we had HA on Icehouse and it was a mess, especially the
>  > Networking/Neutron part. Namespaces were not transferred between nodes, so
>  > if one went down, VMs lost networking. Reboots were indeed a lottery:
>  > sometimes they worked, sometimes they did not. And when we lost power
>  > once, I had to rebuild the node.
>  > Of course the FIWARE Lab handbook asks for an HA solution, but I see that
>  > in the case of Spain this has already been violated ;).
>  > My two cents is that the guys from Spain made the right choice. I do
>  > not think HA in OpenStack is ready for production, especially with a big
>  > number of users.
>  >
>  > Regards,
>  > Fanis
>  >
>  >
>  > On 13/11/2015 11:33 AM, Riwal KERHERVE wrote:
>  >
>  >
>  >
>  >
>  >
>  > Sean,
>  >
>  >
>  >
>  > In Grizzly, any time we needed to restart processes handled by CRM, it
>  > was a lottery. Sometimes everything went fine, and sometimes the processes
>  > kept on rebooting and it took us hours to put things back in order.
>  >
>  > In Juno, we never experienced this kind of behavior. When we needed to
>  > restart processes through CRM, everything always went fine.
>  >
>  >
>  >
>  > To answer your question:
>  >
>  > The only time we played with HA was to take into account some
>  > modifications in our configuration files. I do not recall exercising HA
>  > capabilities, such as putting one node down and switching all
>  > processes to the other node.
>  >
>  >
>  >
>  > BR
>  >
>  > Riwal
>  >
>  >
>  >
>  > From: sean at gopaddy.ch [ mailto:sean at gopaddy.ch ]  On behalf of Sean
>  > Murphy
>  > Sent: Thursday, 12 November 2015 17:01
>  > To: Riwal KERHERVE
>  > Cc: fiware-lab-federation-nodes at lists.fiware.org
>  > Subject: Re: [CESNET #134122] Re: [Fiware-lab-federation-nodes]
>  > experiences with HA
>  >
>  > Hi Riwal,
>  >
>  > Good feedback - thanks for that.
>  >
>  > As a matter of interest, have you ever needed to exercise any of the HA
>  > capabilities or have you tested it in anger?
>  >
>  > BR,
>  > Seán.
>  >
>  > On Thu, Nov 12, 2015 at 4:51 PM, Riwal KERHERVE via RT <
> xifi-support at rt.cesnet.cz > wrote:
>  >
>  > Sean,
>  >
>  > I do not have experience with Kilo in HA, but our node is in Juno and
>  > in HA. We installed it with Fuel 6.0 (2 controllers and 1 arbitrator).
>  > We have never had any trouble until now: it is very stable, nothing like
>  > HA in Grizzly.
>  >
>  > BR
>  > Riwal
>  >
>  > From: fiware-lab-federation-nodes-bounces at lists.fiware.org [mailto:
>  > fiware-lab-federation-nodes-bounces at lists.fiware.org ]  On behalf of
>  > Sean Murphy
>  > Sent: Thursday, 12 November 2015 16:33
>  > To: fiware-lab-federation-nodes at lists.fiware.org
>  > Subject: [Fiware-lab-federation-nodes] experiences with HA
>  >
>  >
>  >
>  >
>  > Hi all,
>  >
>  > We're looking at our upgrade strategy and we're curious to
>  > hear any experience with Kilo HA both from the deployment
>  > perspective as well as the operations perspective.
>  >
>  > From XiFi, I remember Fanis reporting a split-brain scenario
>  > with HA and in the end he opted not to go with a HA solution;
>  > this gives me pause for thought when considering this
>  > deployment solution, even though it seems to be the
>  > preferred solution.
>  >
>  > Generally, we would be well disposed to a HA deployment
>  > as we would like to learn about it, but we do not want to
>  > end up deploying a technology that is too far from production
>  > readiness.
>  >
>  > Does anyone have any experience that they can share on this
>  > point?
>  >
>  > BR,
>  > Seán.
>  >
>  >
>  >
>  >
>  >
>  > _______________________________________________
>  > Fiware-lab-federation-nodes mailing list
> Fiware-lab-federation-nodes at lists.fiware.org
> https://lists.fiware.org/listinfo/fiware-lab-federation-nodes
>  >
>  >
>  > Αποποίηση ευθυνών / Disclaimer
>  --
>
>
>
>
> --------------------------------------------------------
>  Giuseppe Cossu
>  CREATE-NET
>  Smart Infrastructures
>  Research Engineer
>  Via alla Cascata 56/D - 38123 Povo Trento (Italy)
>  e-mail: giuseppe.cossu at create-net.org
>  Tel:  (+39) 0461312428
>  www.create-net.org
>  --------------------------------------------------------
>
> ----------------
>
>  The information contained in this transmission is privileged and
> confidential information intended only for the use of the individual or
> entity named above. If the reader of this message is not the intended
> recipient, you are hereby notified that any dissemination,  distribution or
> copying of this communication is strictly prohibited. If you have received
> this transmission in error, do not read it. Please immediately reply to the
> sender that you have received this communication in error and then delete
> it.